













VOLUME 42, Parts 1 AND 2 JUNE 1955 





STUDIES IN THE HISTORY OF PROBABILITY AND STATISTICS* 
I. DICING AND GAMING (A NOTE ON THE HISTORY OF PROBABILITY) 


By F. N. DAVID 
University College, London 


‘See, this is new? It hath been already of old time.’ (Ecclesiastes i. 10.) 


1. A cynical archaeologist remarked recently that a symptom of decadence in a civiliza- 
tion is when men become interested in their own history, and he added that in the unlikely 
eventuality of any proof being required of the decadence of this phase of Homo sapiens it 
could be found in the present-day interest in archaeology. Most generalizations of a 
sweeping character such as this are unacceptable, chiefly because there is no way of putting 
them to proof; but the present interest of scientists in general, and of statisticians in par- 
ticular, in the origins of scientific thought, is far from implying the decadence of science, 
whatever may be implied by an interest in the arts. 

It is inviting, and at the same time profitless, to speculate why modern scientists have 
such an interest. The possibility of deciding priority of discovery which concerned the 
Victorian scientist so closely does not cause much controversy to-day, for the modern 
scientist would hold that to ascribe any discovery in the field of science to any single person 
is unrealistic. Thus, while we are taught at school, for example, that Newton and Leibnitz 
separately and independently ‘discovered’ the differential calculus, it would perhaps be 
more appropriate to say that Newton and Leibnitz each supplied the last link in the chain 
of reasoning which gave us the differential calculus—a chain which can be traced back 
through Pierre Fermat, Barrow, Torricelli and Galileo, and that it is surprising that there 
were only two mathematicians who did this. 

Mathematics is essentially an expression of thought in which we build on the mental
effort of our forerunners, and probability is no exception to this general rule. The real 
difficulty we meet with in trying to trace probability back to its origins is that it started 
essentially as an empirical science and developed only lately on the mathematical side. It 
is hard to say where in time the change came from empiricism to mathematical formalism 
as it appears to have taken place over hundreds of years; and the claims put forward for 
Pascal and Fermat as the creators of probability theory cannot entirely be substantiated. 


2. When man first started to play games of chance is a time problem we shall never 
clearly resolve. We may place on record that it is a commonplace thing for archaeologists 
to find a preponderance of astragali† among the bones of animals dug up on prehistoric
sites. One archaeologist stated that he had found up to seven times as many as any other
bone, another put the figure at 500 (sic!), while yet a third, refusing to be drawn to a figure,
stated that they were many. This fact has probably little significance. The astragalus has
little marrow in it and was possibly not worth cracking for the sake of its contents as were
the long bones; it is knobbly and presents no flat curves for drawing as does the shoulder
blade for example. All we may do is to place on record that round about 40,000 years ago
there were large numbers of the astragali of sheep, goat and deer lying about.

* [Editorial note. It is hoped to publish articles by a number of different authors under this
general heading.]

† The astragalus is a small bone in the ankle, immediately under the talus or heel-bone. See Pl. 2a.

The astragali of animals with hooves are different from those with feet such as man, dog 
and cat. From the comparison in Text-fig. 1 we note how in the case of the dog the 
astragalus is developed on one side to allow for the support of the bones of the feet. The 
astragalus of the hooved animal is almost symmetrical about a longitudinal axis and it is 
a pleasant toy to play with. In France and Greece children still play games with them in 
the streets, and it is possible to buy pieces of metal fashioned into an idealized shape but 
still recognizable as astragali. 





Text-fig. 1. Drawings of the astragalus in sheep and dog, natural size.


3. Some time between prehistoric man of four hundred centuries ago and the beginning 
of the third millennium before Christ Homo sapiens invented games and among these 
games, games of chance. We know from paintings, terra-cotta groups, etc., that the 
astragalus was used in Greece like the ancient quoit,* but there is no doubt from paintings 
on tombs in Egypt and excavated material that the use of the astragalus in games where 
it is desired to move counters was well established by the time of the First Dynasty. In 
one painting a nobleman, shown playing a game in his after-life, delicately poises an 
astragalus on his finger tip, a board with ‘men’ in front of him. A typical game of 
c. 1800 B.C. is that of ‘Hounds and Jackals’ illustrated in Pl. 1. The game seems similar
to our present-day ‘Snakes and Ladders’; the hounds and jackals were moved according to
some rule by throwing the astragali found with the game and shown in the figure. Variants
of this game were undoubtedly played from the time of the First Dynasty (c. 3500 B.C.).

* From the name ‘knucklebone’ we might infer that among the early games were those in which the
astragali were balanced on the bones of the knuckles and then tossed and caught again.

Pl. 1. The game of Hounds and Jackals.

It is possible but not altogether likely that these games originated in Egypt. They certainly
did not originate in Greece, as has been claimed, for reasons which we shall give later.
However, Herodotus, the first Greek historian, like his present-day counterparts, was
willing to believe that the Greeks (or allied peoples) had invented nearly everything. His
claim that the Lydians introduced coinage has about as much foundation as his claim
regarding games of chance. He writes (c. 500 B.C.) about the famine in Lydia (which was
c. 1500 B.C.) as follows:

The Lydians have very nearly the same customs as the Greeks. They were the first nation to
introduce the use of gold and silver coins and the first to sell goods by retail. They claim also the
invention of all games which are common to them with the Greeks. These they declare that they
invented about the time that they colonized Tyrrhenia, an event of which they give the following
account. In the days of Atys, the son of Manes, there was great scarcity through the whole land of
Lydia. For some time the Lydians bore the affliction patiently, but finding that it did not pass away,
they set to work to devise remedies for the evil. Various expedients were discovered by various
persons; dice and huckle-bones (i.e. astragali) and ball and all such games were invented, except
tables,* the invention of which they do not claim as theirs. The plan adopted against famine was to
engage in games on one day so entirely as not to feel any craving for food, and the next day to eat
and abstain from games. In this way they passed eighteen years.


In yet another commentary we are told that games of chance were invented during the 
Trojan war by Palamedes. During the 10-year investment of the city of Troy various 
games were invented to prevent the soldiers’ morale suffering from boredom. 


4. The game of ball is mentioned by Homer and according to Plato was evolved in 
Egypt. It is not, however, a game of chance. The story of dice we shall return to, but we 
may first carry the story of the astragalus a little further. In the early part of the first 
millennium it would seem that astragali were used by both adults and children for their 
leisure games. Homer (c. 900 B.C.) tells us that when Patroclus was a small boy he became
so angry with his opponent while playing a game of knucklebones that he nearly killed him. 
Another writer of the same period tells us that students played knucklebones everywhere, 
that they were acclaimed as presents and that as a prize for handwriting one student was 
given eighty astragali all at once! It is not difficult to imagine the small boys of that era
collecting astragali much as the boys of our own era collect marbles.

That the astragalus was used commonly in the gaming which the Greeks and later the 
Romans conducted with zeal and passion, the references in the literature of that time leave 
no room for doubt. One of the chief games may have been the simple one of throwing four 
astragali together and noting which sides fell uppermost. The astragalus has only four sides 
on which it may rest, and it has been deduced, among others by Nicolas Leonicus Thomeus 
(1456-1531), that a common method of enumeration was that the upper side, broad and 
slightly convex counted 4, the lower side broad and hollowed 3, the lateral side narrow and 
flat 1 and the other lateral which is slightly hollow 6. These aspects of a sheep’s astragalus
are shown in Pl. 2a. (With present-day astragali the probabilities of scoring 1 and 6 are each
approximately equal to 1/10 and those of 3 and 4 approximately 4/10.) The worst throw
for the Greeks with one bone was unity which they called the dog, and sometimes the 
vulture. The best of all throws with four knucklebones was the throw of Venus when all 
four sides were different, which has an actual probability of about 1/26. But at different
times and in different games the numbers must have been varied, for the throw of Euripides
with four astragali, discussed by several fifteenth-century writers, was worth 40. How the
bones fell to achieve this result is not stated, although Cardano writing in the sixteenth
century states that it was four fours. (Probability c. 1/39.)
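
The two probabilities quoted for the throw of Venus and for four fours can be checked by the multinomial rule. The sketch below is illustrative only: it assumes that the bones fall independently and that each has exactly the rounded probabilities given above (1/10 for the sides scoring 1 and 6, 4/10 for those scoring 3 and 4). The helper name `throw_probability` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

# Rounded single-bone probabilities quoted in the text (an approximation).
SIDE_PROB = {1: Fraction(1, 10), 3: Fraction(4, 10),
             4: Fraction(4, 10), 6: Fraction(1, 10)}

def throw_probability(scores):
    """Probability of obtaining the given scores in any order,
    assuming the bones fall independently (hypothetical helper)."""
    counts = Counter(scores)
    arrangements = factorial(len(scores))  # orderings of the bones
    prob = Fraction(1)
    for score, k in counts.items():
        arrangements //= factorial(k)      # repeated scores are indistinguishable
        prob *= SIDE_PROB[score] ** k
    return arrangements * prob

venus = throw_probability([1, 3, 4, 6])      # all four sides different
euripides = throw_probability([4, 4, 4, 4])  # four fours, following Cardano

print(venus, float(1 / venus))          # 24/625, i.e. about 1 in 26
print(euripides, float(1 / euripides))  # 16/625, i.e. about 1 in 39
```

Exact fractions are used so that the agreement with 1/26 and 1/39 is not blurred by rounding in the arithmetic; the only approximation is in the assumed single-bone probabilities.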


5. In classical Rome the astragalus was imitated in carved stone with figures and scenes 
incised on the sides. A typical example is illustrated in Pl. 2b. Stone astragali have also
been found in Egypt. At this time too we have the production of lewd figures in metal or
bone varying in size from about 1 cm. to over 1 in. in height. That these figures were used
for gaming may be deduced from the fact that the six possible positions in which the figure 
may fall are each marked with a number of dots. 

* This may have been an early form of backgammon or may have been shuffle-board.

† I have not tested these figures for bias. They are a development, I think, of dice rather than
astragali.

Besides the astragali it appears possible that throwing sticks were also used for games of
chance, although it may be that they had a greater religious significance; we shall return
to this point later. The throwing stick was made of wood or ivory and was often approximately
3 in. in length with cross-section when square of about 1 cm. each way. Such
throwing sticks were known to the ancient Britons, to the Greeks, the Romans, the 
Egyptians and to the Maya Indians of the American continent. Sometimes the sticks are 
elliptical in cross-section with major axis of approximately 1 cm., but they are all alike in 
having only four numbers on them, one at each end of the upper face and one at each end 
of the lower parallel face. In the European throwing sticks the majority of numbers are 
marked by small engraved circles, but they are sometimes indicated by cuts in the wood 
or ivory. The Maya throwing sticks are marked by coloured scratches in ivory. The actual 
numbers marked vary. They are mostly 1, 2, 5, 6, but 3 and 4 have also been noticed. 
These throwing sticks are of little importance in gaming. They are mentioned because it is 
interesting to note the likelihood that gaming originated at many points, and, although 
this is a remark one could not defend, that it possibly was originally a debasement of a 
religious ceremony. 


6. The six-sided die may have been obtained from the astragalus by grinding it down 
until it formed a rough cube. The Musée du Louvre has several astragali which have been
treated in this way, but one cannot imagine they formed very satisfactory dice. The 
honeycombed (or cancellous) bone tissue has been exposed in several places and the crude 
die would clearly not have a long life. Whether the die was evolved in this way or not the 
evolution must have taken place some considerable time before Christ. The earliest die 
known was excavated in northern Iraq and is dated at the beginning of the third millennium. 
It is described as being of well-fired buff pottery. The dots are arranged as shown in Text- 
fig. 2(i), the edges at A and B being imagined folded away from the reader. It will be noted 
that the opposite points are in consecutive order, 2 opposite 3, 4 opposite 5 and 6 opposite 1. 

A die excavated in Mohenjo-Daro (Ancient India) is also dated as third millennium, and 
it is also made of hard buff pottery. The order of the points is again consecutive, but this
time we have 1 opposite 2, 3 opposite 4, 5 opposite 6. Few other dice have been recorded in
this millennium. At the time of the XVIIIth dynasty in Egypt (c. 1400 B.C.) a die with the
markings shown in Text-fig. 2 (ii) must have been in play. The arrangement of the five dots 
is unusual. Somewhere about this time, however, the arrangement of the numbers settled 
down to the familiar two-partitions of 7 opposite one another as shown in Text-fig. 2 (vi), 
which arrangement has persisted until to-day. Out of records, collected by the present
writer, of some fifty dice of classical times made of crystal, ivory, sandstone, ironstone,
wood and other materials, forty had the two-partition of 7 arrangement. A twelfth-century
(A.D.) Greek bishop wrote that this was the way in which a die should be marked, and a
sixteenth-century gambler theorizes that this arrangement was chosen to make it easy to 
check whether all the numbers had been marked on the die and no figure duplicated at the 
expense of leaving out another. One die of the first millennium is said to have 9 opposite 6, 
5 opposite 3, and 4 opposite 2. It may have been especially made for a particular game; 
alternatively, it is possible that it may have some ceremonial significance. This is possibly 
also true of a die marked as in Text-fig. 2(iii), although it might have been a die used for 
cheating. 
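
The check which the sixteenth-century gambler had in mind can be put in a few lines. The sketch below is an illustration under our own assumptions: a die is represented as three pairs of opposite faces, and the helper name `is_two_partition_of_seven` and the `faulty` example are hypothetical. The first two layouts are those described above for the dice from northern Iraq and Mohenjo-Daro.

```python
def is_two_partition_of_seven(opposite_pairs):
    """True when the six faces are exactly 1 to 6 and every pair of
    opposite faces sums to 7 (hypothetical helper)."""
    faces = sorted(n for pair in opposite_pairs for n in pair)
    all_present = faces == [1, 2, 3, 4, 5, 6]
    sums_to_seven = all(a + b == 7 for a, b in opposite_pairs)
    return all_present and sums_to_seven

northern_iraq = [(2, 3), (4, 5), (6, 1)]  # consecutive order, as described
mohenjo_daro = [(1, 2), (3, 4), (5, 6)]   # consecutive order, as described
classical = [(1, 6), (2, 5), (3, 4)]      # the familiar arrangement
faulty = [(1, 6), (2, 5), (3, 3)]         # hypothetical: a 3 duplicated, no 4

for name, die in [("northern Iraq", northern_iraq),
                  ("Mohenjo-Daro", mohenjo_daro),
                  ("classical", classical),
                  ("faulty", faulty)]:
    print(name, is_two_partition_of_seven(die))
```

With the two-partition arrangement a duplicated or missing number shows itself at once, because the pair containing it no longer sums to 7.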


7. That dice were used in Egypt is clear from the XVIIIth-dynasty specimen. It is 
thought, however, that dicing did not become common until the advent of the Ptolemaic 
dynasty (300 to 30 B.C.) which originated from Greece. Several dice are known of this
period including a beautiful specimen in hard brown limestone of side c. 1 in., which has
the sacred symbols of Osiris, Horus, Isis, Nebhat, Hathor and Horhudet engraved on the
six sides. This would almost undoubtedly have been used for some form of divination rites.

Dice in Britain were in a very primitive state at the time of Christ. The pieces used then
were formed by roughly squaring the long bone of an animal and cutting it into sections 
to form objects approximately cubical in shape. The marrow was taken out leaving a hollow 
square cylinder of which the cross-section in diagrammatic form is sketched in Text-fig. 2 (iv). 
(The British Museum has two of these.) These primitive dice had the two-partition of 7
arrangement, 3 being opposite 4 on the hollow ends. Some dice had a 3 on each end and
no 4. Several dice of this kind have been excavated in the chalk and flint country and dated
late in the first millennium. 

The working out of the geometry of solid figures by Greek mathematicians appears to 
have been followed almost immediately by the construction of polyhedral dice. A beautiful 
icosahedron in rock crystal now at the Musée du Louvre is the most famous of these. (In
the diagram of Text-fig. 2(v) it may be imagined the outline is folded away from the
reader.) A figure with 19 faces badly cut but apparently imagined to be rectangular has a
Roman numeral on each face from I to X, and above that the numbers rise by tens to C. The
number LXXX is missing, but the number XX appears twice on one face. There was also
a die of 18 faces probably formed by beating out a cubical die, a die with 14 faces and so on. 
Faked dice were also not unknown. Apart from the device of leaving out one number 
and duplicating another it is stated that hollow dice have been found dating from Roman 
times. The unity on the face of the die forms a small round plate which can be lifted. It is 
suggested that a small ball of leather could be crammed into the hollow of the die through 
this hole in such a way as to cause the die to tend to fall in a predetermined manner. 


8. Gaming reached such popularity with the Romans that it was found necessary to 
promulgate laws forbidding it except at certain seasons. What game was played by the 
common people we do not know, but there are many references to those played by the
emperors. In Suetonius’s Life of Augustus (Loeb’s translation) we find: 


He (Augustus) did not in the least shrink from a reputation for gaming, and played frankly and
openly for recreation, even when he was well on in years, not only in the month of December, but on
other holidays as well and on working days too. There is no question about this, for in a letter in his
own handwriting he says, ‘I dined, dear Tiberius, with the same company; ... We gambled like old
men during the meal both yesterday and to-day, for when the dice were thrown whoever turned up
the “dog” or the 6, put a denarius in the pool for each one of the dice, and the whole was taken by
anyone who threw the Venus’.


There are several other references to gaming in this Life. Whether the word talis should be 
translated as dice or astragali (knucklebones) is a moot point. The die as we know it is 
usually referred to as ‘tessera’. The astragalus is often called the talus (or heel-bone), and 
this is the word Suetonius actually used. From the description of the play it would seem 
appropriate to read knucklebones for dice. 

In Suetonius’s Life of Claudius we are told that Claudius was so devoted to dicing that 
he wrote a book about it, and that he used to play while driving, throwing on to a board 
fitted especially in his carriage. From another source we learn that he played right hand 
against left hand. 


9. These two instances are chosen to illustrate the passion for gaming which apparently 
possessed the Romans, and it is possible to cite many others. The question which constantly 
recurs to one while studying these games of the past is ‘Why did not someone notice the 
equi-proportionality property of the fall of the die?’ It is understandable that no theory 
was made to describe the fall of the astragalus. But the Greeks had performed the neces- 
sary abstraction of thought to make the mathematical idealization of the cube (and other 
solid figures); at first sight it seems curious that mathematicians did not then go on a little
further and give equal weight to each side of the cube and so on. For if dicing and gaming 
generally were carried on by so many persons for so long that it was thought necessary to 
prohibit them, surely someone must have noticed that with a cube on the average any one 
side turned up as frequently as any other? We can only make guesses on this point, but it 
would seem to the writer that there are two possible explanations, the imperfections of the 
dice and their use in religious ceremonies. 

10. Imperfect dice. We speak of a true or a fair die nowadays when we mean that there 
is no bias apparent when the die is thrown. In Roman times, and presumably earlier, it seems 
to have been the exception rather than the rule for the die to be true. Many dice of the 
classical period have been thrown by the writer and they were nearly all biased but not all 
in the same way. For example, three classical dice from the British Museum gave the 
results shown in the table from 204 throws each. The arrangements of the pips on the dice
were as in Text-fig. 2(vi), (vii) and (viii). The rock crystal is a beautifully made die; the
others are a little primitive, and the sides of the iron die are only approximately parallel. 
The marble and the iron dice are obviously biased, and this was true of many of the other 
dice examined. A photograph of a wooden die of the classical period is given in Pl. 2(c). 
It will be noted that one of the faces shown is not square, and the impression one has is that 
the owner picked up a piece of wood of convenient shape, smoothed it a little and engraved 
the pips. It would therefore have been difficult, except over a long period, to notice any 
regularity. 




















Number of pips ...     1     2     3     4     5     6

Rock crystal          30    38    31    34    34    37
Iron                  35    39    30    21    37    42
Marble                27    28    23    47    25    54
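
The counts in the table can be compared with the 34 throws per face expected of a true die (204/6) by means of Pearson's chi-square statistic. This is a modern check which the text itself does not carry out, and the sketch below is offered only as an illustration; the helper name `chi_square_uniform` is hypothetical, and 11.07 is the tabulated upper 5% point of chi-square with 5 degrees of freedom.

```python
# Counts from the table: 204 throws of each die, faces 1 to 6.
THROWS = {
    "rock crystal": [30, 38, 31, 34, 34, 37],
    "iron":         [35, 39, 30, 21, 37, 42],
    "marble":       [27, 28, 23, 47, 25, 54],
}

def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal face probabilities
    (hypothetical helper)."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Upper 5% point of chi-square with 5 degrees of freedom (standard tables).
CRITICAL_5_PERCENT = 11.07

for name, counts in THROWS.items():
    stat = chi_square_uniform(counts)
    print(f"{name:12s} chi-square = {stat:6.2f}  "
          f"exceeds 5% point: {stat > CRITICAL_5_PERCENT}")
```

With these counts the statistics come to about 1.5, 8.4 and 25.2 respectively, so that on this test the marble die departs most clearly from uniformity.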





11. Divination. In spite of the imperfections of the dice it is probable that some theory 
might have been made if magic or religion or both had not been involved also. A scheme 
whereby the deity consulted is given an opportunity of expressing his wishes appears to be
a fundamental in the development of all religions. As late as 1737 we have John Wesley 
deciding by the drawing of lots whether to marry or not (John Wesley’s Journal, vol. 1, 
1737, Friday, 4 March), and in the practices of present-day primitive tribes we get an echo 
from the classical era. At that time pebbles of diverse shapes and colours, arrows, astragali 
and dice were all used to probe the divine intention. In the temples there were various and 
varied rites attached to the process of divination by lot, but the main principle was the 
same. The question was posed, the lot was cast, the answer of the god was deduced. The 
dice (astragali, etc.) were thrown sometimes on the ground, sometimes on a consecrated 
table. 

It was customary in classical Greece and Rome for the four astragali of the gamblers to 
be used in the temples. The prediction was that the throw of Venus (1, 3, 4, 6 uppermost) 
was favourable and the dogs unfavourable. In the temple of the oracles tablets were hung 
up and the priest, or possibly the suppliant, interpreted the throw of the four bones by 
reference to the tablets. Cases have been recorded, however, where five astragali were used. 
Greek inscriptions found in Asia Minor give a fairly complete record of how the throws of 
five were interpreted. Each throw was given the name of a god. Thus Sir James Frazer 
translates (commentary on Pausanias): 

1, 3, 3, 4, 4 = 15    The throw of Saviour Zeus
One one, two threes, two fours,
The deed which thou meditatest, go, do it boldly.
Put thy hand to it. The gods have given these favourable omens.
Shrink not from it in thy mind. For no evil shall befall thee.

It is not clear whether the order of the numbers is important or not. If order does not
matter then the probability of this throw is about 0.08.* The tesserae of the gambler were
also used in divination ceremonies as well as the astragali, and it is possible that the same 
interpretation was given to the numbers falling uppermost, although the presence of the 2 
and the 5 would make this a little awkward. 
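
The figure of about 0.08 for the throw of Saviour Zeus can be verified by direct enumeration of the 4^5 = 1024 ordered falls of five bones. The sketch below is illustrative only and rests on two assumptions of ours: that the bones fall independently, and that each has the rounded probabilities quoted in section 4 (1/10 for 1 and 6, 4/10 for 3 and 4). The helper name `probability_unordered` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Rounded single-bone probabilities from section 4 (an approximation).
SIDE_PROB = {1: Fraction(1, 10), 3: Fraction(4, 10),
             4: Fraction(4, 10), 6: Fraction(1, 10)}

def probability_unordered(target):
    """Sum the probabilities of every ordered fall of the bones whose
    sorted scores equal the sorted target (hypothetical helper)."""
    wanted = sorted(target)
    total = Fraction(0)
    for fall in product(SIDE_PROB, repeat=len(wanted)):
        if sorted(fall) == wanted:
            p = Fraction(1)
            for score in fall:
                p *= SIDE_PROB[score]
            total += p
    return total

zeus = probability_unordered([1, 3, 3, 4, 4])
print(sum([1, 3, 3, 4, 4]))  # 15, the value of the throw
print(zeus, float(zeus))     # 48/625 0.0768
```

Thirty of the ordered falls give one 1, two 3's and two 4's. If the order of the bones did matter, the probability of any one of them would be one-thirtieth of this figure.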


* I propose to write at greater length on ‘divination probabilities’ on a further occasion. 
In addition to the divination carried out by the priests it was apparently a commonplace
for individuals to perform acts of divination with regard to events in their daily lives. Thus 
Lucian, telling the story of the young man who fell madly in love with Praxiteles’s Venus 
of Cnidos, writes: 

He threw four knucklebones on to the table and committed his hopes to the throw. If he threw 
well, particularly if he obtained the image of the goddess herself, no two showing the same number, 
he adored the goddess, and was in high hopes of gratifying his passion: if he threw badly, as usually
happens, and got an unlucky combination, he called down imprecations on all Cnidos, and was as
much overcome by grief as if he had suffered some personal loss. 


Again we have from Propertius: 


When I was seeking Venus (i.e. good fortune) with favourable tali, the damned dogs always leaped 
out. 

12. It is perhaps of interest here to interpolate a note on divination as reportedly practised
by the Buddhists of present-day Tibet. According to Hastings (Dictionary of Comparative
Religions) the simplest method is carried out by the people themselves. Many laymen are
equipped with a pocket divination manual (mé-pe) and the augury is found by casting lots.
This lot-casting can either be odds and evens (the random pouring of grain, pebbles or coins
from a horn, cup, etc.) or dice on a sacred board or cards on which there are magic signs,
or sheets or passages of scriptures drawn from a bowl. The reincarnation prediction is, it is
said by Waddell,* usually carried out by a priest. The rebirth chart seen by the writer
consists of 56 squares (8 × 7), each 2 in. square. Each square corresponds to a future state. A six-sided
die with letters on it is thrown down on to the rebirth chart, and according to the
square on which it lands and the letter which falls uppermost so the priest predicts.
Waddell, who visited Tibet as a member of a British Mission, obtained one of these charts 
and a die (c. 1893). He remarks: ‘The dice (sic!) accompanying my board seems to have 
been loaded so as to show up the letter Y, which gives a ghostly existence, and thus neces- 
sitates the performance of many expensive rites to counteract so undesirable a fate.’ 
Possibly a similar chicanery was practised in Roman times! It would seem a reasonable 
inference anyway that the mystery and awe which the religious ceremony would lend to 
the casting of lots for purposes of divination would prevent the thinking person from 
speculating too deeply about it. Any attempt to try to forecast the result of a throw could 
undoubtedly be interpreted as an attempt to forecast the action of the deity concerned, and 
such an act of impiety might be expected to bring ill luck in its train. In addition, as we 
have noted, a method for such forecasting could not easily be made owing to the imperfections
of most of the dice. On the other hand, it is possible that probabilities were known to
the priests since the ceremonial dice are well made. 


* L. A. Waddell, The Buddhism of Tibet, W. Heffer and Sons Ltd. 1934 (2nd edition).


13. Through the Dark Ages the Christian church appears to have carried on guerilla
warfare against gaming with knucklebones and dice. The writers of the Renaissance make
many references to bishops who write de aleatoribus or contra aleae ludum during the first
fourteen hundred years of the Christian era. It is likely therefore that the bishops wished
to get rid of the sortilege as a religious ceremony, and they succeeded to a certain extent in
doing this, although divination by lot still survives to-day in the Moravian sect. What the
bishops could not do was to stop men playing games of chance. There are several references
in early French literature to gaming. The play of Jean Bodel, Le Jeu de Saint Nicolas,
written c. A.D. 1200, has a scene where thieves are gambling in a tavern. They are playing
the dice game of ‘Le hasard’,* the rules of which were set down clearly by Pierre-Raymond
Montmort in his book some five hundred years later (Essay d’analyse sur les jeux de hazard, 1708,
p. 113). Bodel’s play is interesting for the suggestion that the thieves knew how to
manipulate the dice to produce a desired result.


14. With the invention of printing (c. 1450) and its rapid development during the latter 
half of the fifteenth century the references to games of chance become more numerous, but 
there seems to be no suggestion of the calculation of probabilities. Thus we find in the 
writing of François Rabelais, a man who might be expected to know the latest in games
of chance as played in taverns, the following interesting passage: ‘Then they studied the
Art of painting or carving; or brought into use the antic play of tables, as Leonicus hath
written of it or as our good friend Lascaris playeth at it’† (Gargantua and Pantagruel,
Urquhart’s translation, Book 1, Chapter xxiv). Gargantua and Pantagruel was issued in 
sections at intervals between 1532 and 1552. The date of this reference will therefore be not 
long after 1532. 

The Leonicus of the reference is Nicolas Leonicus Tomeus, a professor of Greek and Latin 
at Padua who was born at Venice in 1456. He was well known for his learning and philo- 
sophical bent and acted as tutor to the English Cardinal Pole when as a young man he 
visited Italy. According to Erasmus he was ‘a man equally respectable for the purity of 
his morals and the profundity of his erudition’. His letters, which have been translated by 
Cardinal Gasquet, give an interesting picture of the life of an intellectual of that time. He 
died at Padua in 1531 and his collected works were printed at Basel in 1532. Rabelais is 
clearly referring to Leonicus’s treatise Sannutus, sive de ludo talario, a dialogue in the 
manner of Plato concerning the game of knucklebones (astragali). There is, however, little 
relevant to the calculus of probability in this work. The discussion turns on references to 
the game in Roman literature and a description and argument of the value of the various 
types of throw.‡

A similar type of disquisition was written by Calcagnini about this time but possibly a 
little later than that of Leonicus. Celio Calcagnini was born at Ferrara in 1479 and died
there in 1541. He was a poet, a philosopher and astronomer of repute; his treatise Quomodo
caelum stet, terra moveatur, vel de perenni motu terrae commentatio, in which he held that the 
earth moved round the sun, anticipated Galileo Galilei by some years, for Galileo was not 
born until 1564. The dissertation of Calcagnini entitled De talorum, tesserarum ac calculorum 
ludis ex more veterum is less philosophical in tone than that of Leonicus. It is of interest to 
probabilists only in that it was an influence over Cardano, who, from his several references, 
had clearly studied it closely. 

* According to the editor, F. J. Warne, of the text of the play, Le Jeu de Saint Nicolas, ‘hasart’ 
meant the throw of a certain number of points at dice, varying according to the game played. In 
present-day probability theory the meaning is of course much wider. 

† Rabelais actually wrote ‘en usage l’anticque jeu des tables ainsi qu’en a escript Leonicus’. Duchat
in the commentary on the 1741 edition says ‘Ce n’est point tables qu’il faut lire ici, comme dans toutes
les Editions, mais tales’. Presumably Duchat (followed by later commentators) makes this correction
because of the work of Leonicus referred to. It is just possible that Rabelais meant what he wrote and
that he was referring to the ancient board game (from which the modern game of backgammon developed)
in which the ‘men’ may have been moved by throwing astragali, the counting of the throws
being that described by Leonicus.

‡ I have not been able to trace why Lascaris is mentioned by Rabelais. Andre-Jean Lascaris
surnamed Rhyndaconus (1445-1535), a Greek scholar born in Phrygia, was Librarian to François I.
He rescued many Greek manuscripts from the Turks. Possibly he collected references to gaming in
Greek literature much as Leonicus did for Roman?

15. We arrive at the sixteenth century, then, with a well-known humanist Leonicus, 
and a great astronomer Calcagnini, writing on games of chance with no attempt or reference 
to the calculation of a probability. (This does not mean of course that some calculations 
had not been made in a manuscript which we do not know about.) There were, moreover, 
other scholars and bishops writing on the same topic about this time, so that interest in the 
subject was keen. As far as we know at present it was left to Gerolamo Cardano to make 
the step forward. Cardano, the illegitimate son of a geometer, Fazio Cardano, was born in 
Pavia in 1501. His illegitimacy was a bar to his professional advancement on more than 
one occasion, and it is possible that the bitterness engendered by this fact was responsible 
for his not too scrupulous regard for the attribution of other scientists’ ideas. The crime of 
plagiarism was a common accusation among scientific workers of the sixteenth and 
seventeenth centuries, but of none was it raised more loudly than of Cardano who was 
strongly disliked by his contemporaries and despised by his successors. Until about the 
middle of the nineteenth century his biographers unite in regarding him as a charlatan; 
possibly at the present time the pendulum has swung too far the other way, and more is 
read into his writings than is justified. The truth would seem to lie somewhere between the 
extremes of charlatan and persecuted savant. 

Cardano was physician, philosopher, engineer, pure and applied mathematician, 
astrologer, eccentric, liar and gambler, but above all a gambler. He himself owns that on 
one occasion he sold his furniture and his wife’s possessions in order to get money to indulge 
his passion for gaming, and there is no doubt that this passion was one of the things which 
ruled him through his whole life. His chief interest professionally was medicine, but he 
interested himself also in the communication of spirits and the casting of horoscopes. He 
does not seem to have been too successful at this last, but he was not deterred from casting 
that of Jesus, a performance the impiety of which probably led to his imprisonment. Even 
allowing for the exaggerations of his biographers there seems to be no doubt that he was 
eccentric to the point of madness. This did not prevent him, however, from making con- 
tributions to pure mathematics, and it is to this combination of pure mathematician and 
gambler that we owe the Liber de Ludo Aleae. This treatise was found in manuscript in 
Cardano’s papers after his death in Rome in 1576, and was first published in his collected 
works in 1663 at Lyon. Cardano implies that it was written c. 1526; the exact date is not 
important since no question of priority or plagiarism is involved, but it is curious that a 
manuscript of this kind should have survived fifty years of his remarkably variegated 
career. 


16. The first complete translation of de Ludo Aleae into English is given in Cardano, the 
Gambling Scholar, by Oystein Ore, published in 1953. Ore remarks that the book is badly 
composed and that understanding of Cardano’s work has possibly been hindered by this. 
There are some, however, who will not agree with his commentary on the treatise and who 
may feel that as much prescience is now attributed to Cardano as there was before too little. 
The crux of Cardano’s work is to be found in the section entitled ‘On the cast of one die’ in 
Ore’s translation: 


The talus has four faces and thus also four points. But the die has six; in six casts each point 
should turn up once; but since some will be repeated, it follows that others will not turn up. The 
talus is represented as having flat surfaces, on each one of which it lies on its back;. . .and it does not 
have the form of a die. One half of the total number of faces always represents equality; thus the 
chances are equal that a given point will turn up in three throws, for the total circuit is completed 
in six, or again that one of three given points will turn up in one throw. For example, I can as easily 
throw one, three or five as two, four or six. The wagers are therefore laid in accordance with this 
equality if the die is honest.... 


We have therefore the necessary abstraction made; if the die is honest, i.e. if we may give 
equal weight to each side, then we may calculate the chances. There is no doubt, I think, 
that Cardano was led to this conclusion empirically and his generalization of it is partially 
wrong. For he goes on to discuss casts of two dice and three dice giving tables which are 
correct if ‘the dice be honest’. When we come, however, to the section ‘On play with 
knucklebones’, it seems that he falls into error. The knucklebones (or astragali) have four 
sides. The different combinations of numbers which may arise in the throwing of four 
astragali are correctly enumerated, but the chances are calculated under the assumption that 
all sides are equi-probable; which they are not. Possibly Cardano had never played with 
astragali, for it is likely that if he had he would have noticed that to assume the sides of the 
astragalus had equal weight in his enumeration of alternatives was not adequate. But this 
fumbling suggests that he was not quite clear in his own mind about what he was proposing. 
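
The two statements in the passage quoted from Cardano can be checked by direct calculation. The following is a minimal sketch, assuming an honest six-sided die and independent casts; the function names are illustrative only and are not taken from Cardano or from Ore.

```python
from fractions import Fraction

# Assumption: an "honest" die, i.e. six equally likely faces, and independent casts.
FACES = 6


def chance_one_of(points: int) -> Fraction:
    """Chance that a single cast shows one of `points` named faces."""
    return Fraction(points, FACES)


def chance_point_within(throws: int) -> Fraction:
    """Chance that one named point appears at least once in `throws` casts."""
    return 1 - Fraction(FACES - 1, FACES) ** throws


def expected_appearances(throws: int) -> Fraction:
    """Average number of times a named point appears in `throws` casts."""
    return Fraction(throws, FACES)


if __name__ == "__main__":
    print(chance_one_of(3))         # one of three named points in one throw
    print(chance_point_within(3))   # a named point at least once in three throws
    print(expected_appearances(3))  # average number of appearances in three throws
```

On these assumptions the chance of one of three given points in one throw is exactly 1/2, while the chance of a given point turning up at least once in three throws is 91/216, somewhat less than 1/2. It is the average number of appearances in three throws that equals 1/2, which may be one way of reading the 'circuit' argument.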

I do not think that the fact that Cardano did not quite see the mathematical abstraction 
clearly can detract from the fact that he did, on paper at any rate, as far as we know, 
calculate the first probability by theoretical argument, and in so doing he is the real begetter 
of modern probability theory. The claims of his biographer that he anticipated the law of 
large numbers, etc., may not be acceptable; it would appear that Cardano was judging 
from his experience rather than his algebra. 


17. It would be strange if Cardano, following the mode of his age, did not communicate 
some of his thoughts about gaming to his pupils. Fear of being accused of plagiarism, fear 
of being plagiarized, may have kept him silent, but the whole tone of his treatise is a 
practical one; practical advice about playing, laying odds and so on make up a large portion 
of it. He would therefore almost certainly have discussed its contents with his friends, 
particularly if he thought about it over as long a period of time as he suggests. The fact that 
de Ludo Aleae did not appear in print until 1663 does not therefore seem to be a reason 
why Cardano’s ideas should not have been common knowledge to scholars in Italy after his 
death, and the way in which Galileo Galilei plunges into his discussion of dice playing, 
without much preamble, tends to lend colour to this. 

Galileo Galilei was born in Pisa in 1564, the son of Vincent Galilei, a musicographer well 
known in his day. He died in 1642 at Arcetri after a career as full of achievement as any 
that has ever been known. His contributions to science, both as astronomer and as mathe- 
matician, are striking for their originality of thought and clarity of purpose. Why this 
prince of scholars has never received the full recognition which is his due it is difficult to 
say. It is thought by some modern writers that his sensible recantation of the earth’s 
movement, after physical torture at the hands of the Inquisition at the age of 70, has caused 
a revulsion to him among the scientists of later years. This is probably not so; what is more 
likely is that the envious fellow-scholars who delivered him to the Inquisition conspired 
after his death to belittle the work which he had done. In this they were possibly helped 
by Galileo’s literary style which is noteworthy for clarity but not brevity, being in fact 
prolix and tedious in the extreme; no i is left undotted, no t is left uncrossed.* 

* E. S. Pearson suggests to me that this prolixity was one which Galileo shared with many other 
Renaissance writers, and that it arose from the struggle which the early mathematicians must have 
had to formulate mathematical abstractions on paper. I think that this may well be so. 

This being so, if there had been any doubt about the general method of procedure in calculating 
chances with a die we should have had a long disquisition on the subject. However, in 
Sopra le Scoperte de i Dadi* he plunges straight away into his argument. The problem† is 
one already touched on by Cardano. Three dice are thrown. Although there are the same 
number of three partitions of 9 as there are of 10, yet the probability of achieving 9 in 
practice is less than that of throwing 10. Why is this? I quote a little from E. H. Thorne’s 
translation of this note. The note begins: 


The fact that in a dice game certain numbers are more advantageous than others has a very 
obvious reason, i.e. that some are more easily and more frequently made than others, which depends 
on their being able to be made up with more variety of numbers. Thus a 3 and an 18, which are 
throws which can only be made in one way with 3 numbers (that is, the latter with 6, 6, 6 and the 
former with 1, 1, 1, and in no other way) are more difficult to make than, for example, 6 or 7, which 
can be made up in several ways, that is a 6 with 1, 2, 3 and with 2, 2, 2 and with 1, 1, 4 and a 7 with 
1, 1, 5; 1, 2, 4; 1, 3, 3; 2, 2, 3. Again, although 9 and 12 can be made up in as many ways as 10 and 11 
and therefore they are usually considered as being of equal utility to these, nevertheless it is known 
that long observation has made dice players consider 10 and 11 to be more advantageous than 9 
and 12. 


This extract serves to show how he begins the topic assuming that the calculations are 
known; it also serves to illustrate the prolixity of Galileo’s style. After some discussion of 
the six 3 partitions of 9 and of 10, he goes on: 


Since a die has six faces and when thrown it can equally well fall on any one of these, only six 
throws can be made with it, each different from all the others. But if together with the first die we 
threw a second, which has also six faces, we can make 36 throws each different from all the others, 
since each face of the first die can be combined with each of the second.... 


After saying that the total number of possible throws with three dice is 216, he gives a 
table of the number of possible throws for a total of 10, 9, 8, 7, 6, 5, 4, 3, noting that the 
numbers 11-18 inclusive are symmetrical with these. Thus the number of possible throws 
for 10 is 27 and 25 for 9. His treatment of the problem is exactly that which we should use 
to-day and leaves us in no doubt that the calculation of a probability from the mathematical 
concept of the equi-probable sides of the die was clearly known to the sixteenth-century 
mathematicians of Italy. We can marvel at the person asking Galileo the question; he 
obviously gambled sufficiently to be able to detect a difference in empirical probabilities 
of 1/108.‡ 
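
Galileo's counts are easily verified by enumeration. The sketch below assumes three honest dice, so that the 216 ordered throws are equally likely; the variable and function names are illustrative. It counts, for each total, both the unordered triples (the partitions on which the gamblers' intuition rested) and the ordered throws (on which the chances depend).

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations_with_replacement, product

FACES = range(1, 7)

# Ordered throws of three dice: 6**3 = 216 cases, equally likely if the dice are honest.
throws = Counter(sum(t) for t in product(FACES, repeat=3))

# Unordered triples of faces, i.e. three-partitions with every part between 1 and 6.
partitions = Counter(sum(t) for t in combinations_with_replacement(FACES, 3))


def chance(total: int) -> Fraction:
    """Chance of a given total with three honest dice."""
    return Fraction(throws[total], 6 ** 3)


if __name__ == "__main__":
    for total in (9, 10, 11, 12):
        print(total, partitions[total], throws[total])
    print(chance(10) - chance(9))
```

The enumeration gives six partitions each for 9 and for 10, but 25 ordered throws for 9 against 27 for 10, so that the two chances differ by 2/216 = 1/108.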


18. Galileo’s collected works were first published in Bologna in 1656, but this fragment 
on gambling was not included. It does appear in the more complete collection published 
at Florence in 1718. Since, however, Galileo thought the problem of little interest, for he 
did not pursue it, there seems to be no reason why he should have made a secret of it, and 
following the custom of his day he probably instructed his pupils. At any rate it is evident 
that the mathematical probability set was no stranger to the French mathematicians of 
the seventeenth century, as is witnessed by the now famous correspondence between 
Pascal and Fermat in 1654. The first letter of the series, from Pascal to Fermat, setting out 
the problem of points is missing. We have, however, Fermat’s reply to it, and the subsequent 


* This is Galileo’s own title. Considerazione sopra il Giuoco dei Dadi, a later title, appears first in the 
collected works of 1718. 

† Like Pascal sometime later, Galileo wrote to answer a problem put to him by a gambler. 

‡ M. G. Kendall points out to me that the problem posed by the Chevalier de Méré to Pascal 
concerning the problem of points involved similar small probabilities. 

follow-up,* and from the way in which Fermat writes it seems clear that the actual defini- 
tion of probability is assumed known. What the two savants were interested in was the 
application of this definition to specific problems which were concerned with dice playing 
between gamblers of equal skill and opportunity. The approach to the problems is similar 
to that of Galileo, and the generalizations which are made from the particular cases dis- 
cussed are not well supported. 

It is true that Galileo wrote on one problem only and fairly briefly at that, but it is difficult 
to see why Pascal and Fermat should be preferred as the originators of probability theory 
before Galileo or Cardano. It may well be that the precocity of Pascal as a mathematician 
led to much of his work being accepted with acclamation, and certainly without its priority 
being questioned. We find, for example, the famous Arithmetic Triangle in Stifel’s 
Arithmetic (1543), in the General Trattato of Tartaglia in 1556, in the Arithmetic of Simon Stevin 
of Bruges (Leiden, 1625). It is possible that Pascal may not have known of these writers. 
However, he certainly knew of Pierre Herigone’s Cours Mathématique (Paris, 1634), since 
he makes several references to it in his own Usage du Triangle Arithmétique pour trouver les 
puissances des binômes et Apotomes. Herigone uses a table of numbers analogous to the 
Arithmetic Triangle to find binomial coefficients. Perhaps this same aura which dazzled 
Pascal’s contemporaries (and at the same time caused them to overlook some of Fermat’s 
work) still blinds us to-day. 


19. If we take the origins for granted and look at developments of the theory, then by 
far the greatest impetus to theory during the years 1650-60 must have come from the 
publication of De Ratiociniis in Aleae Ludo by Christian Huygens. Huygens as a young 
man of 26 arrived in Paris in July 1655 on the equivalent of the English ‘Grand Tour’. 
He did not meet Pascal, Fermat or Carcavi, the intimate friend of Pascal, but he did meet 
Roberval, professor of mathematics at the Collège Royal de France, who is mentioned by 
Pascal as having been also approached by the Chevalier de Méré. Huygens stayed in Paris 
from July to November, and after his return to Holland he began a correspondence with 
both Carcavi and Fermat which lasted over a period of years. The young man’s imagination 
was obviously fired by the discussions he had in Paris, and his mathematical ambitions 
stimulated by the immense activity of the group which some ten years later (1665) was to 
found the Académie des Sciences. He set himself to work, and in March 1656 he wrote to 
Prof. van Schooten that he had prepared a manuscript about dice games. Francis Schooten 
was professor of mathematics at Leyden and had been Huygens’s teacher. He took the 
young Huygens’s manuscript (which was written in his native language), translated it into 
Latin and published it as an appendix to his Exercitationes Mathematicae in 1657. (A French 
translation of this appendix can be found in Œuvres de Huygens, tome 14, on ‘Calcul des 
Probabilités’ published by La Société Hollandaise des Sciences in 1920.) In this Tractatus 
de Ratiociniis in Aleae Ludo Huygens sets out in a systematic manner what he must have 
learnt in Paris and adds some results which he may have achieved himself. 

In the letter to Francis Schooten he writes 


...quelques-uns des plus Célèbres Mathématiciens de toute la France se sont occupés de ce genre 
de Calcul, afin que personne ne m’attribue l’honneur de la première Invention qui ne m’appartient 


* It is interesting to see Pascal fall into the same kind of trap which caused D’Alembert such 
controversy. In discussing the game of heads and tails and the tossing of a coin D’Alembert argued 
that the probability of throwing a head with two tosses of a coin was 2/3. For we may have TT, TH 
or H—when we stop, the second throw being immaterial, since we have achieved what we want. 

pas. Mais ces savants...ont cependant cachés leurs méthodes. J’ai donc dû examiner et approfondir 
moi-même toute cette matière à commencer par les éléments, et il m’est impossible pour la raison 
que je viens de mentionner d’affirmer que nous sommes partis d’un même premier principe. ... 





Accordingly Huygens begins by proving his basic propositions, deals at some length with 
the problem of points and then passes on to dice playing. His last proposition (XIV) has 
a familiar ring: 


If another gambler and I throw 2 dice turn and turn about with the condition that I will have 
won when I throw 7 points and he will have won when he throws 6, if I allow him to throw first, 
find my chance and his of winning. 


His delineation of his fourteen propositions is admirably clear and concise, and it is no 
marvel that the tract was used by mathematicians as a reference book up to the time of 
James Bernoulli (who reprinted it) and beyond. Possibly by this crystallization of the 
ideas of the French mathematicians Huygens has earned the right to be regarded as the 
father of the probability theory. 
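
Proposition XIV can be worked through with exact fractions. The sketch below assumes two honest dice and independent throws; the function names are illustrative, and the argument used (that a round in which both players miss returns the game to its starting state) is offered as one convenient route to the answer, not as Huygens's own method.

```python
from fractions import Fraction
from itertools import product


def chance_of_total(total: int) -> Fraction:
    """Chance of a given total with two honest dice (36 equally likely cases)."""
    hits = sum(1 for a, b in product(range(1, 7), repeat=2) if a + b == total)
    return Fraction(hits, 36)


def first_player_wins(p_first: Fraction, p_second: Fraction) -> Fraction:
    """Chance that the player who throws first wins an alternating game."""
    # If both players miss in a round the game starts afresh, so the chance w
    # satisfies w = p_first + (1 - p_first) * (1 - p_second) * w.
    both_miss = (1 - p_first) * (1 - p_second)
    return p_first / (1 - both_miss)


if __name__ == "__main__":
    # The opponent throws first and needs 6; "I" throw second and need 7.
    opponent = first_player_wins(chance_of_total(6), chance_of_total(7))
    print(opponent, 1 - opponent)
```

Under these assumptions the opponent's chance comes to 30/61 and that of the second player to 31/61, so that the advantage of the first throw does not quite offset the smaller chance of throwing a 6.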

20. After Huygens the interest of probabilists was not solely in gaming, although this 
interest did not die away for another hundred years or so. But with Huygens the new 
calculus seems fairly launched, and this is therefore a suitable point to make a break. 
There are many questions which one leaves unanswered. The drawings and paintings by 
palaeolithic man of himself are very rare, and there is probably no hope of finding pictures 
of his recreations. If he prized the astragalus as a toy it seemed a possibility that he might 
have carved or decorated it in some way, but I have not been able to find any record of 
this. But while we cannot pull aside the curtain from four hundred centuries the possibility 
does exist that the pre-historians may be able, one day, to take us back a little farther than 
the third millennium. The farther back one goes the more fragmentary the evidence, but 
the earliest dice found are described as being of ‘well-fired buff pottery’, and they certainly 
would not have been the first made. 

The tantalizing period to the present writer is the period from the invention of printing 
to A.D. 1600. In this period we have two mathematicians only calculating probabilities, 
and yet this was in the immense intellectual ferment of the Italian Renaissance. It seems 
hardly possible that there were not other natural philosophers who attempted similar 
calculations, but such documents, if they exist, will only now come to light by chance. 

The correspondence between the French mathematicians of the first half of the seventeenth 
century is almost complete, and presumably the possibility does exist here of finding further 
letters. They all seemed at one time or another to send letters to one friend under cover of 
letters to another, and such letters may conceivably still be ascribed to the wrong person. 
However, enough information does exist regarding the seventeenth-century mathematicians 
to make a coherent study, and if I appear to have done them scant justice it is because 
I find the period so interesting that I hope to write about it more fully on another occasion 
elsewhere. 


Collecting information about dicing and gaming has been a hobby of mine for some time, 
and the list of persons who have drawn my attention to one aspect or another of it is 
formidable. I want to thank Prof. B. Ashmole of the British Museum who allowed me 
critically to examine the dice of the classical period which are in his care and M. Jean 
Charbonneaux of the Musée du Louvre who did me the same service. To Prof. C. M. Robertson 
of my own college I owe not only the privilege of tossing many dice but many stimulating 
discussions and useful references. A. J. Arkell allowed me to examine the dice 
brought by Prof. Sir Flinders Petrie from Egypt and to photograph various gaming boards 
not reproduced here. The breadth of knowledge and wide reading of Miss M. S. Drower 
have acquainted me with many Egyptian board games which provide a fascinating puzzle 
for those interested in deducing how they are played. Miss J. Lowe and R. Graves drew 
my attention to various references in classical literature. The illustration of the Hounds 
and Jackals game is by the courtesy of the Metropolitan Museum of Art of New York. 

I want to thank Dr J. H. Willis who translated the Sive de ludo talario of Nicolas 
Leonicus for me, Dr E. H. Thorne who supplied a translation of Galileo’s letter on dice, 
Prof. B. Woledge who drew my attention to early French plays and Miss J. Townend 
who drew Text-fig. 1. Miss J. Pearson and Miss J. Edmiston helped me to find many 
references and A. Munday and Miss A. Lodge helped with photographs. The manuscript 
as a whole owes much to the keen critical faculties of Prof. E. S. Pearson and Prof. 
M. G. Kendall. Part of this work was carried out with the aid of a grant from the 
Central Research Fund of the University of London. 


SOME FEATURES OF THE GENERATION TIMES 
OF INDIVIDUAL BACTERIA 


By E. O. POWELL 
Microbiological Research Department, Experimental Station, Porton 


INTRODUCTION 


The dynamics of bacterial growth in the mass has been the subject of a great deal of experi- 
mental study, but quantitative observations on the behaviour of individuals are scanty. 
In spite of the primary need for detailed knowledge of normal growth processes, it is the 
abnormal and morbid that have received most attention from both experimenters and 
theoreticians. Kendall (1952a) and Finney & Martin (1951) have recognized the necessity 
for extended studies of the normal condition; in this they have been prompted by the 
existence of two hypotheses (Kendall, 1948, 1952a; Rahn, 1932), both of which connect 
the observed scatter of generation times with an inner mechanism of fission. 

Measurements of individual generation times sufficiently systematic and detailed to 
make analysis worth while have been provided only by Kelly & Rahn (1932) so far as I know. 
My object here is, first, to add to the existing corpus of data; secondly, to show that for 
technical reasons, the experimental work so far done (including my own) is not really 
adequate to test the hypotheses that have been proposed; thirdly, to suggest further lines 
of study. 

According to Rahn’s hypothesis, the fission of a bacterium involves the simultaneous 
duplication of a number of essential entities or structures which may be the genes. The 
time required for duplication will not be the same for each, but will be subject to a law of 
‘chance’, and fission can only take place when every entity has been duplicated. Under 
certain assumptions, Rahn shows that the observed generation times should then be 
scattered according to the law 

$$dF = \frac{g}{m}\, e^{-\tau/m}\left(1 - e^{-\tau/m}\right)^{g-1} d\tau, \qquad (1)$$

where dF is the proportion of generation times in the range τ to τ + dτ. The parameter g is 
equal to the number of entities which have to be duplicated, and m is a parameter depending 
on the rate of the individual processes. In what follows, I call the frequency function (1) 
‘Yule’s distribution’ (since Yule (1925) was the first to describe it). 

Kendall’s reasoning is similar, except that he supposes the events leading to fission to 
take place step by step, and fission to occur only when the series of steps is complete. Each 
step may or may not be a process of duplication. The observed generation times should then 
be scattered according to a Pearson Type III law in the form 


$$dF = \frac{1}{m\,\Gamma(g)} \left(\frac{\tau}{m}\right)^{g-1} e^{-\tau/m}\, d\tau, \qquad (2)$$


where g is the number of steps and m is a kinetic parameter as before. 
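
The difference between the two hypotheses may be made concrete numerically. The sketch below is illustrative only: it assumes that under Rahn's hypothesis the g duplication times are independent and exponentially distributed with mean m (so that the generation time is the largest of them), and that under Kendall's hypothesis the g steps are independent exponential times with mean m (so that the generation time is their sum). The scaling of m in the Type III law is an assumption made here for the comparison and may differ from the form used in the paper. The function names and the integration limits are likewise hypothetical.

```python
import math


def yule_pdf(tau: float, g: int, m: float) -> float:
    """Density of the largest of g independent exponential times, each of mean m."""
    x = math.exp(-tau / m)
    return (g / m) * x * (1.0 - x) ** (g - 1)


def type3_pdf(tau: float, g: int, m: float) -> float:
    """Density of the sum of g independent exponential times, each of mean m."""
    return (tau / m) ** (g - 1) * math.exp(-tau / m) / (m * math.gamma(g))


def moments(pdf, g: int, m: float, upper: float = 60.0, n: int = 60_000):
    """Mean and variance by the midpoint rule on [0, upper].

    `upper` should be many multiples of g * m so that the neglected tail is negligible.
    """
    h = upper / n
    m0 = m1 = m2 = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        w = pdf(t, g, m) * h
        m0 += w
        m1 += w * t
        m2 += w * t * t
    mean = m1 / m0
    return mean, m2 / m0 - mean * mean


if __name__ == "__main__":
    for name, pdf in (("Yule", yule_pdf), ("Type III", type3_pdf)):
        mean, var = moments(pdf, 4, 1.0)
        print(name, round(mean, 4), round(var, 4), round(math.sqrt(var) / mean, 4))
```

With g = 4 and m = 1 the computed moments agree with the closed forms: for the largest of g exponentials the mean is m(1 + 1/2 + ... + 1/g) and the variance m²(1 + 1/4 + ... + 1/g²), while for the sum the mean is gm and the variance gm². For the same g the two laws therefore imply quite different coefficients of variation.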
The idea of a distribution of generation times implies in practice (though this is not 
strictly necessary) some degree of permanence in the properties of the population to which it 
applies. At a given instant, the organisms in a bacterial culture certainly possess a definite 
age distribution, but this has no significance if the growth rate and cultural conditions have 
varied during the life span of the oldest extant organism. A fortiori, a distribution of genera- 
tion times, even if it be formally defined at an instant, will be a complex and imperspicuous 
character of its population unless the factors controlling it remain constant over at least 
several generations. Conditions adequate to the precise definition and determination of 
a distribution probably exist only, if at all, in a continuous culture maintained in a steady 
state. No measurements have yet been made on such a system, and I have assumed here 
(as has been done implicitly in the past) that during the phase of logarithmic growth in 
static culture the succession of generations proceeds in a regular manner. I give, however, 
two examples to the contrary. 

It is usually found that the mass growth rate (the logarithmic derivate of the mass of 
organisms per unit volume) in a static culture rapidly accelerates at the end of the lag phase 
to a steady value which is maintained for several hours. The number growth rate (the 
logarithmic derivate of the number of organisms per unit volume) often behaves less regu- 
larly. It may be small or zero even after the mass growth rate has reached its maximum 
value; abnormally long organisms are thereby formed. There follows a stage of acceleration 
during which the number growth rate approaches and temporarily exceeds the mass 
growth rate. The two then become equal, in some cases only for a short time before the phase 
of decline sets in. These remarks apply to culture in a liquid medium. Detailed quantitative 
information about the growth of colonies on solid media over long periods (i.e. of many 
generations) appears to be lacking, and it is only on solid media that observations of in- 
dividual generation times can be made. With these difficulties in mind, it is clear that a 
satisfactory experimental procedure is not easily attained. Observations which are to be 
combined so as to furnish a distribution curve must either be confined to a brief period in 
the life of each culture examined, or, if extended, they must be shown to be drawn from a 
sensibly unchanging population. 

In a culture whose number growth rate is varying there is temporarily an excess or 
deficit of young organisms; a change in the growth rate implies a change in generation time 
distribution. It does not necessarily imply a change in all the parameters of that distribu- 
tion, but even if some one were known or assumed to be constant, it would be a matter of 
considerable practical difficulty to extract its value from a set of data derived from a 
changing population. At present it is not possible to test or to make use of Kendall’s (1948) 
relation between the coefficients of variation of clone size and generation time; a method of 
maintaining a constant growth rate for many generations must first be established. 

The results which I have to present consist of measurements of generation times carried 
out on six species of organisms (Bacterium aerogenes, strain N.C.T.C. 8197; Bact. coli 
anaerogenes, strain N.C.T.C. 4450; Streptococcus faecalis; Proteus vulgaris, strain LC; Bacillus 
mycoides, strain SR 2; B. subtilis, strain UP 1). Their distributions are compared with the 
frequency functions suggested by Kendall and Rahn. A partial analysis and discussion of 
some other features of the nexus of generation times are also given. I use the words 
‘mother’, ‘daughter’, ‘sister’ with an obvious extension of their usual meaning, and the 
neutral terms ‘inception’, ‘termination’ to denote respectively the events at which an 
organism becomes a recognizably separate entity (by fission of its mother) and by which it 
ceases to be so (by itself dividing). The same terms may also refer to the epochs of these 


events. In describing the morphological changes in the cell wall occurring at fission I replace 

much more accurately since inception and termination took place in close juxtaposition 
both temporally and spatially; records of generation times of 2, 4 and 6 min. may be taken 
as correct to ± 1 min., and therefore also correct as to their frequency. As it turns out, this 
is an important consideration. 

The uncertainty introduces an error of a special kind. A generation time which should 
be recorded as τ may be recorded as τ − 2 or τ + 2 for instance; it will then appear in a 
frequency group adjacent to the correct one, and so there will be correlated errors in the 
frequencies. The effect will increase the variance, probably to a quite negligible extent, but 
there may also be a concomitant error in the χ² between the observations and a fitted 
function, making the fit seem better or worse than it is. I have found no way of estimating or 
allowing for this aberration. 

I wish to make it quite clear what is here meant by a measured generation time τ. The 
fission of an organism is first observed to be complete, as nearly as can be judged, at a time 
t₁, say. It has therefore divided at some time between t₁ − 2 and t₁. Similarly, one of its 
daughters terminates at some time between t₂ − 2 and t₂. Then the generation time of that 
daughter is recorded as t₂ − t₁, which is always a multiple of 2. Thus τ is the mean value of 
the generation time of the group of organisms to which it is ascribed. I make this simple 
point because a careful study of Kelly & Rahn’s (1932), Rahn’s (1932) and Finney & 
Martin’s (1951) papers raises doubts as to what Kelly & Rahn have actually recorded. 
Their figs. 1 and 2 suggest that they proceeded as I have done (though with a 5 min. instead 
of a 2 min. interval between observations), but in their tables the observations are grouped 
into ranges 0-5, 5-10, 10-15, etc. Rahn and Finney & Martin have the misleading phrase 
‘fissions ... observed in successive five-minute intervals’. Finney & Martin assumed that the 
mean generation times for these intervals were 2½, 7½, 12½, ...; they may have been 
2½, 5, 10, 15, ... or 5, 10, 15, 20, .... I have verified that if either of these alternatives is chosen, 
the goodness of fit of Yule’s distribution is about the same as that obtained by Finney & 
Martin (χ² = 14·9), but the corresponding estimates of g are widely and significantly different: 
18·2 and 36·2 respectively, as against 26·0. 
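
The recording convention can be illustrated by a small simulation. The sketch below is an illustration under stated assumptions and does not reproduce any of the measurements: each fission is taken to be noted at the first observation epoch (a multiple of 2 min.) at or after its true time, inceptions are spread uniformly over a long period, and the true generation times follow a hypothetical gamma law with mean 30 min. The names, the seed and the parameters are all assumptions.

```python
import math
import random

STEP = 2.0  # minutes between successive observations, as in the text


def observed(true_time: float) -> float:
    """First observation epoch (a multiple of STEP) at or after the true time."""
    return math.ceil(true_time / STEP) * STEP


def simulate(n: int = 100_000, seed: int = 1):
    """Mean and variance of (recorded - true) generation time over n organisms."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n):
        t1 = rng.uniform(0.0, 1000.0)           # true inception (hypothetical spread)
        true_tau = rng.gammavariate(20.0, 1.5)  # hypothetical true generation time
        recorded = observed(t1 + true_tau) - observed(t1)
        errors.append(recorded - true_tau)
    mean = sum(errors) / n
    var = sum((e - mean) ** 2 for e in errors) / n
    return mean, var


if __name__ == "__main__":
    mean_error, var_error = simulate()
    print(mean_error, var_error)
```

Under these assumptions the recorded value is always a multiple of 2, the mean error is close to zero, and the error variance is close to 2/3 min.² (that of the difference of two independent uniform variates on a 2 min. range). This is small beside the variance of the generation times themselves, which is 45 min.² for the hypothetical law used here.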

Other observational arrangements are possible. For instance, the distribution of generation 
times can be deduced from the age distribution in a culture. If φ(a) is the frequency 
function of age and f(τ) that of generation time, it is found that if the culture is growing 
steadily 

$$\phi(a) = \phi(0)\, e^{-ka} \int_a^{\infty} f(\tau)\, d\tau \qquad (4)$$

(cf. Harris, 1951). Unfortunately, the form of the curve φ(a) is dominated by the value of k; 
it is very insensitive to changes in the parameters of f(τ) that leave k the same. 
Experimentally, the determination of an age distribution at a given time requires a study of the 
previous history of the culture over a period long enough to give the generation time dis- 
tribution directly. I have, however, made use of the relation (4) in the reverse sense, as a 
check on the regularity of growth in experiments with B. mycoides and B. subtilis. 

Again, suppose that a number of organisms extant at the epoch t = 0 are watched until 
they divide. The times t = θ of fission will be distributed with a frequency function 

$$\phi(0)\, e^{k\theta} \int_{\theta}^{\infty} e^{-k\tau} f(\tau)\, d\tau\, d\theta.$$

This gives another possible method of determining f(τ); I have made no use of it. 
(b) Practical details

With one exception, organisms were grown on cellophane over tryptic meat broth in the 
culture chamber described by Harris & Powell (1951). The medium was constantly circulated 
so that an approximation to a condition of continuous culture obtained; the volume of 
medium (100c.c.) was so large relative to the inoculum used (c. 100 organisms) that no 
appreciable change could occur during the course of an experiment, either by depletion of 
nutrients or accumulation of products of metabolism. In this respect the technique is 
superior to that of Ørskov (1922) adopted by Kelly & Rahn. On the other hand, times of
fission can be judged with less precision in cellophane culture; the average error appeared
to be about ± 2 min. However, this contribution to the total variance is only a small fraction
of that due to the organisms themselves. 

The cultures were incubated throughout at a temperature of 35 ± 0.5 °C. The two Bacillus
species were inoculated as spores, the rest as saline dilutions from a 20 hr. liquid culture.
To avoid confusion, the organisms on which observations were to be made were separated 
from their neighbours by means of a micromanipulator so that each lay within a clear area 
of cellophane. They were exposed to light as little as possible. Evaporation from the surface 
of the cellophane was controlled so that each organism or group of organisms was sur- 
rounded by a visible fillet of liquid (Harris & Powell, 1951). The evaporation was tem- 
porarily increased only when necessary to verify the occurrence of fission. A photographic 
method of recording was rejected after trial as less informative than direct examination, 
and no less tedious when the processing and assessment of films were taken into account. 

No difficulty was encountered in working with Bact. aerogenes and Bact. coli anaerogenes. 
A typical experiment consisted in following the development of four organisms simul- 
taneously for two or three complete generations. Organisms were selected for observation 
always at about 2hr. from inoculation, previous experience having shown that erratic 
behaviour at the end of the lag phase would by then have subsided. 

Strep. faecalis was found to be very sensitive to light; a few minutes’ exposure to the usual
conditions of illumination sufficed to stop growth. The effect was minimized by adding 
5% of sheep serum to the medium and illuminating with red light as dim as could be 
tolerated. As a further precaution, observation was confined to one complete generation 
only of the progeny of the selected organisms. The numerical results suggest that these 
measures were effective in securing uniform unrestricted growth over the short period 
necessary. Observation was begun at 2½ hr. from inoculation.

The mode of fission of Pr. vulgaris in cellophane culture was erratic, and seemed sometimes 
to be intermediate in character between the septate and the isthmoid. The first signs of 
a waist were apparently accompanied by the formation of a septum, and the organisms would 
remain for several minutes in this state of incipient fission. Attempts to assess generation 
times were unsatisfactory, and this species was therefore studied in culture on an agar- 
tryptic meat medium under incident dark-field illumination (Pearce & Powell, 1951). 
Judging by the lengths of organisms as then seen, the uncompleted fissions observed by the 
vertical illumination on cellophane were recorded as complete fissions under the dark field. 
For this organism, then, the conditions of culture were similar to those in Kelly & Rahn’s 
experiments. For 3½-4 hr. from inoculation it appeared to grow and reproduce steadily,
but then the rate of fission declined, longer motile organisms were formed and swarming 
began. In order not to encroach on this phase of changing growth rate, observations were 
again confined to one generation only, beginning at 2½ hr. from inoculation. (This organism
did not swarm on cellophane over a circulating medium, presumably because no local 
concentration of metabolic products could be built up. See Lominski & Lendrum, 1947.) 

Observations on B. mycoides were carried out during the 80 min. beginning at 2½ hr. from
inoculation. By that time the spore coat had been shed and families of two to four organisms
developed. Three groups of experiments were conducted with B. subtilis: series (i), observation
begun at 2½ hr.; series (ii), observation begun at 4½ hr.; series (iii), organisms growing
over defibrinated sheep blood diluted with its own volume of normal saline; observation
begun at 3 hr. In order to check the regularity of growth of these two species, the ages of the
organisms extant at the end of each experiment were recorded, so that their distributions
could be compared with those calculated by means of (4) from the generation time
distributions.


THE FREQUENCY FUNCTIONS 


I give in this section a statement of the frequency functions which are made use of below, 
together with their relevant properties. Each is limited to positive values of the variate, 
and each involves a parameter g which, whatever its interpretation, is a measure of intrinsic 
dispersion, i.e. a function of the coefficient of variation only. 

Fitting was carried out either by the method of maximum likelihood or by that of moments 
(sometimes both). In several of the examples the distributions are such that the method of 
moments is of low efficiency (cf. Fisher, 1922). However, I do not think that the results are 
of much more than comparative value in any case, and the points I shall have to make are 
not so nice as to call for great statistical refinement. 


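(a) The Pearson Type III distribution (Kendall’s hypothesis)

\[ dF = \frac{\tau^{g-1} e^{-\tau/m}}{m^{g}\,\Gamma(g)}\, d\tau . \]

This is a well-known function. The cumulants are

\[ \kappa_r = m^{r} g\,\Gamma(r), \]

whence the first three moments

\[ \mu_1' = mg, \quad \mu_2 = m^{2} g, \quad \mu_3 = 2m^{3} g . \]

The coefficient of variation is

\[ c_v = g^{-1/2} \]

and the skewness and kurtosis

\[ \gamma_1 = \kappa_3/\kappa_2^{3/2} = 2g^{-1/2}, \quad \gamma_2 = \kappa_4/\kappa_2^{2} = 6/g . \]

Both tend to zero for large g.

Maximum-likelihood estimates ĝ, m̂ of g and m are given by

\[ \Sigma f\tau/n - \hat{g}\hat{m} = 0, \]

\[ \Sigma f\log\tau/n - \log\hat{m} - \psi(\hat{g}) = 0, \]

where the f are observed frequencies of the generation times τ, n is Σf and ψ(g) is
d log Γ(g)/dg.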
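
The two maximum-likelihood equations for the Type III law can be solved by eliminating m (m = τ̄/g) and finding the root of log g − ψ(g) = log τ̄ − mean(log τ). The sketch below does this by bisection; it is an illustration and not the procedure of the paper. The digamma approximation, the bracketing interval and the frequency table are assumptions chosen for the example.

```python
import math

def digamma(x):
    """psi(x) = d log Gamma(x)/dx, by upward recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series

def fit_type_iii(times, freqs):
    """Return (g, m) satisfying the two likelihood equations for grouped data."""
    n = sum(freqs)
    mean = sum(f * t for t, f in zip(times, freqs)) / n
    mean_log = sum(f * math.log(t) for t, f in zip(times, freqs)) / n
    s = math.log(mean) - mean_log      # positive unless all times coincide
    lo, hi = 1e-3, 1e6                 # assumed bracket for g
    for _ in range(200):               # bisection on a logarithmic scale
        mid = math.sqrt(lo * hi)
        if math.log(mid) - digamma(mid) > s:
            lo = mid                   # log g - psi(g) decreases with g
        else:
            hi = mid
    g = math.sqrt(lo * hi)
    return g, mean / g

if __name__ == "__main__":
    # Hypothetical frequency table: class mid-points (min.) and counts.
    times = [10, 14, 18, 22, 26, 30, 34]
    freqs = [2, 9, 20, 25, 14, 6, 2]
    g_hat, m_hat = fit_type_iii(times, freqs)
    print(f"g = {g_hat:.3f}, m = {m_hat:.3f}")
```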
(b) Yule’s distribution (Rahn’s hypothesis)

\[ dF = \frac{g}{m}\, e^{-\tau/m} \left(1 - e^{-\tau/m}\right)^{g-1} d\tau . \]

Except for μ₁′, the moments are difficult to calculate and are best obtained through the
cumulants (cf. Daniels, 1952; but see Yule, 1925). The cumulant generating function is
easily found to be

\[ K(t) = \log\Gamma(g+1) + \log\Gamma(1 - itm) - \log\Gamma(g+1-itm), \]

whence

\[ \kappa_r = (-1)^{r} m^{r} \{\Gamma_r(1) - \Gamma_r(g+1)\}, \]

where for uniformity I write

\[ \Gamma_r(x) = (d/dx)^{r} \{\log\Gamma(x)\}, \quad \Gamma_1(x) = \psi(x). \]

In particular,

\[ \mu_1' = \kappa_1 = m\{\Gamma_1(g+1) - \Gamma_1(1)\}, \qquad (5) \]

\[ \mu_2 = \kappa_2 = m^{2}\{\Gamma_2(1) - \Gamma_2(g+1)\}, \qquad (6) \]

\[ \gamma_1 = \frac{\Gamma_3(g+1) - \Gamma_3(1)}{\{\Gamma_2(1) - \Gamma_2(g+1)\}^{3/2}}, \]

\[ \gamma_2 = \frac{\Gamma_4(1) - \Gamma_4(g+1)}{\{\Gamma_2(1) - \Gamma_2(g+1)\}^{2}} . \]

The curve is distinguished from a Type III having the same first two moments by its
slower rise near zero, and by its greater steepness on the left flank. In fact, for increasing g,
γ₁ diminishes but tends to a still rather large non-zero limit

\[ -\pi^{-3}\, 6^{3/2}\, \Gamma_3(1) = 1.14 . \]

In practice γ₂ scarcely differs from 2.4 (= 3!(π⁴/90)(6/π²)²).

Some other features are discussed by Finney & Martin (1951) and Rahn (1932), who note 
the insensitivity of the form of the curve to changes in g. 
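
For integral g the polygamma differences in the cumulants of Yule’s distribution reduce to finite sums of inverse powers, since ψ(g+1) − ψ(1) = Σ 1/j, ψ′(1) − ψ′(g+1) = Σ 1/j², and so on. The sketch below uses this to tabulate the mean, variance, γ₁ and γ₂; the restriction to integer g and the function name are assumptions of the example.

```python
def yule_shape(g, m=1.0):
    """Mean, variance, skewness and kurtosis of Yule's distribution, integer g >= 1."""
    h1 = sum(1.0 / j for j in range(1, g + 1))
    h2 = sum(1.0 / j ** 2 for j in range(1, g + 1))
    h3 = sum(1.0 / j ** 3 for j in range(1, g + 1))
    h4 = sum(1.0 / j ** 4 for j in range(1, g + 1))
    mean = m * h1                    # kappa_1
    variance = m * m * h2            # kappa_2
    gamma1 = 2.0 * h3 / h2 ** 1.5    # kappa_3 / kappa_2^(3/2)
    gamma2 = 6.0 * h4 / h2 ** 2      # kappa_4 / kappa_2^2
    return mean, variance, gamma1, gamma2

if __name__ == "__main__":
    for g in (1, 5, 20, 100, 5000):
        mean, var, g1, g2 = yule_shape(g)
        print(f"g = {g:5d}  mean/m = {mean:7.3f}  gamma1 = {g1:.3f}  gamma2 = {g2:.3f}")
```

For g = 1 the distribution is exponential, and the function returns γ₁ = 2 and γ₂ = 6 as it should; for large g the values approach the limits 1.14 and 2.4 quoted in the text.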

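The maximum-likelihood equations for fitting are

\[ \frac{n}{\hat{g}} + \Sigma f \log\left(1 - e^{-\tau/\hat{m}}\right) = 0, \]

\[ \Sigma f\,\frac{\tau}{\hat{m}} - n - (\hat{g} - 1)\,\Sigma f\,\frac{\tau/\hat{m}}{e^{\tau/\hat{m}} - 1} = 0 . \]

The solution is greatly facilitated by the tables of Einstein functions due to Sherman &
Ewell (1942).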


(c) The Pearson Type III distribution with allowance for bias 


From equation (3), if the true distribution is of Pearson Type III, that derived from data
which are biased in the way already described will be

\[ dF = B\,\alpha(\tau)\, \frac{\tau^{g-1} e^{-\tau/m}}{m^{g}\,\Gamma(g)}\, d\tau, \qquad (7) \]

where α(τ) is given by equation (2). This is a frequency function in the range 0 < τ < T
(T = 80 in all experiments); the moments strictly involve incomplete gamma functions, but
these are here so nearly unity that no appreciable error arises in taking the range of integration
as infinite. And then it is found that

\[ B = \left\{ \frac{e^{kT}(1+km)^{-g} - 1}{e^{kT} - 1} \right\}^{-1}, \]

\[ \mu_1' = mg \left\{ \frac{e^{kT}(1+km)^{-g-1} - 1}{e^{kT}(1+km)^{-g} - 1} \right\}, \]

\[ \mu_2' = m^{2} g(g+1) \left\{ \frac{e^{kT}(1+km)^{-g-2} - 1}{e^{kT}(1+km)^{-g} - 1} \right\}, \]

with k = (log 2)/(mg). The parameters are hence obtained by iteration; approximate values
of m and g are inserted into the expressions in curly brackets, and the two equations for
the moments solved to give a closer approximation which is used as the starting point for
a second cycle, and so on. For the terminus a quo, it is enough to take m = 1, and
τ̄ = gm = 25 min. (a value roughly known a priori).
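
The iterative scheme just described may be sketched as follows. The sketch assumes that the biased first and second raw moments have the forms mg{A₁} and m²g(g+1){A₂}, with A₁ = (e^{kT}(1+km)^{-g-1} − 1)/(e^{kT}(1+km)^{-g} − 1) and A₂ the same with exponent −g−2 in the numerator; these forms, and every name in the code, are assumptions of the example rather than a transcription of the original computation. The test case simply generates "observed" moments from known parameters and recovers them.

```python
import math

def bias_factors(m, g, T):
    """Assumed curly-bracket factors for the biased first and second raw moments."""
    k = math.log(2.0) / (m * g)
    e = math.exp(k * T)
    base = 1.0 + k * m
    denom = e * base ** (-g) - 1.0
    a1 = (e * base ** (-g - 1.0) - 1.0) / denom
    a2 = (e * base ** (-g - 2.0) - 1.0) / denom
    return a1, a2

def biased_moments(m, g, T):
    """First and second raw moments of the biased distribution (assumed forms)."""
    a1, a2 = bias_factors(m, g, T)
    return m * g * a1, m * m * g * (g + 1.0) * a2

def fit_by_iteration(mu1, mu2_raw, T, m=1.0, g=25.0, cycles=100):
    """Insert current (m, g) into the brackets, solve for new (m, g), repeat."""
    for _ in range(cycles):
        a1, a2 = bias_factors(m, g, T)
        first = mu1 / a1          # current estimate of m*g
        second = mu2_raw / a2     # current estimate of m^2 * g * (g + 1)
        g = first * first / (second - first * first)
        m = first / g
    return m, g

if __name__ == "__main__":
    T = 80.0
    mu1, mu2_raw = biased_moments(2.0, 12.0, T)   # hypothetical true m = 2, g = 12
    m_hat, g_hat = fit_by_iteration(mu1, mu2_raw, T)
    print(f"biased mean = {mu1:.2f}; recovered m = {m_hat:.4f}, g = {g_hat:.4f}")
```

The starting point m = 1, gm = 25 follows the text; in this example the biased mean falls below the true mean of 24 min., as the bias against long generation times would lead one to expect.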


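(d) Yule’s distribution with allowance for bias

\[ dF = B\,\alpha(\tau)\, \frac{g}{m}\, e^{-\tau/m} \left(1 - e^{-\tau/m}\right)^{g-1} d\tau . \qquad (8) \]

With the same approximation as in the previous paragraph,

\[ B = (e^{kT} - 1)/(e^{kT}H - 1), \]

\[ \mu_1' = m\{\Gamma_1(g+1) - \Gamma_1(1)\} \left\{ e^{kT}H\, \frac{\Gamma_1(1+km+g) - \Gamma_1(1+km)}{\Gamma_1(g+1) - \Gamma_1(1)} - 1 \right\} \times [e^{kT}H - 1]^{-1}, \]

\[ \mu_2' = m^{2}\{\Gamma_2(1) - \Gamma_2(g+1) + (\Gamma_1(g+1) - \Gamma_1(1))^{2}\} \left\{ e^{kT}H\, \frac{\Gamma_2(1+km) - \Gamma_2(1+km+g) + (\Gamma_1(1+km+g) - \Gamma_1(1+km))^{2}}{\Gamma_2(1) - \Gamma_2(g+1) + (\Gamma_1(g+1) - \Gamma_1(1))^{2}} - 1 \right\} \times [e^{kT}H - 1]^{-1}, \]

where

\[ k = (\log 2)/m\{\Gamma_1(g+1) - \Gamma_1(1)\}, \quad H = \frac{\Gamma(g+1)\,\Gamma(1+km)}{\Gamma(g+1+km)} . \]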

After fitting the function (7) to a set of data, estimates of the true moments were calculated 
from its parameters, and were used to obtain values of the parameters of (8) by means of 
(5) and (6). These values were taken as the first approximation in fitting (8), iteration being 
carried out as in the previous section. 


ANALYSIS OF THE DATA 


General 


The fission of a micro-organism, as usually understood, consists in its separation into two 
geometrically distinct pieces, each with a closed cell wall separating it from its surroundings. 
The closure of the wall follows the division of the cytoplasm at some unknown interval, but 
in unicellular organisms there is at least a unique one-to-one correspondence between the 
true cell division and the observed fission. In the genus Bacillus this is not so; the organisms 
are multicellular, and at any time contain party walls of various degrees of completeness.
The party walls may or may not be laid down as double layers, but they cannot be seen to
be so, and it is not known that each becomes necessarily a site of fission; in these organisms 
fission is a tertiary rather than a secondary process (for recent work on the structure of the 
party wall, see Dawson & Stern, 1954). Thus the generation times of B. mycoides and 
B. subtilis have not the significance for the hypotheses of Rahn and Kendall that those of 
the other organisms have. (The same applies to Kelly & Rahn’s measurements on B. cereus; 
Rahn recognized this.) There is no a priori reason why either hypothesis should not apply, 
as regards its mathematical form, to the Bacilli, but the ‘genetic’ parameter g cannot then 
be interpreted as the number of genes or primitive steps determining fission. For the 
moment, little more than a descriptive treatment of the generation times of these organisms 
can be usefully attempted in this context. 

The raw data are set out in Table 1, together with the first two moments and the skewness 
and kurtosis of each distribution. The range of generation times is very great, and much at 
variance with the vague assumption often made, that bacteria divide rather regularly. 
A striking feature in all the experiments was the highly erratic behaviour of small families 
of organisms (e.g. the six or fourteen constituting the progeny, up to two or three genera- 
tions, of a single ancestor); in some the generation times are quite uniform, in others widely 
dispersed. This unforeseen lack of homogeneity makes statistical analysis difficult, and 
renders doubtful the significance of any broad and simple treatment. Such quantitative 
analysis as I have been able to apply can only be a makeshift; but a more refined investiga- 
tion will not be justified until a larger range of data has accumulated. 

In the paragraphs which follow, some forward references are unavoidable; homogeneity 
is considered first because it affects the methods adopted in assessing the parameter g, 
but the results of curve-fitting are also relevant, and have been used to modify the tests of 
homogeneity. 

Homogeneity of the data 

Because of the special difficulties associated with B. mycoides and B. subtilis, only the 
four unicellular organisms Bact. aerogenes, Bact. coli anaerogenes, Strep. faecalis and 
Pr. vulgaris will be considered in this section. 

Kendall (1948) by applying Bartlett’s (1937) test for homogeneity of variance, showed 
that the estimates of his g obtained from the several experiments of Kelly & Rahn were 
highly inconsistent; and later (1952) he expressed doubts as to the propriety of combining 
the data to obtain an overall estimate of g. Such doubts are of course justified if the source 
of variation is a change in the experimental conditions, and certainly Kelly & Rahn’s
results strongly suggest that day-to-day fluctuations are considerable. 

Any one experiment of the present series yields measurements of the generation times of 
organisms belonging to several families, each the progeny of a single organism selected at 
the beginning of the experiment. The organisms all lie within a single field of the microscope 
and are subjected therefore to nearly identical environmental conditions. The variance of 
generation times within the experiment must be that proper to the organism under those 
conditions, and the coefficient of variation (a constant for each species by hypothesis) 
should yield an unbiased measure of g.

Now it remains true of the present results, as well as of Kelly & Rahn’s, that there is 
apparent lack of homogeneity between experiments. An examination of the coefficients of 
variation showed them to be fairly uniformly distributed, except for a few which were out- 
standingly large; the principal contribution to the variance in these cases was provided by 
Table 1. Frequency of generation times, with the first (μ₁′ = τ̄) and second (μ₂)
moments and the measures γ₁, γ₂ of skewness and kurtosis
| | | 
Generation Bact. | Bact.coli| Strep. | Pr. | B. B. subtilis | B. subtilis | B. subtilis 
times aerogenes anaerogenes| faecalis | vulgaris | mycoides | series (i) | series (ii) | series (iii) 
ocana alas # Ula araalideh mae od} vada 
| | | | | 
: é ul snp gy desi ew Abdt Hee 8 0a wt onpaT {sii bacegooer alas 
2 i a a 2 | 4 | 1 1 
4 ee re ee Gn Bod 6 | 2 2 
6 2 Pie hk. 3,.| 4 6 | oneal 2 3 
8 Sil The Waranncteraie Reurae 10 | 15 | 6 2 
10 ‘Wet e 5) 0 2 ee Lente ede 3 
12 09 atibc@eows | Fy 5 16 of ease otf clayey at] 6 
14 35 45 | 4 9 13 4 | m1 
16 48 49 | 6 7 20 $3 de M, s.1..0087 
18 bet 13 20 25 — | -_ 35 
20 72), 7047 8 34 2 420i PUiogE Rope gg 
22 6. |. 68 1K 8 BT iho xBbde po) Broaqor 
24 |. ee es “The Sr 98.3) 14 Roo rd olu@f 6 ld enc 
26 42 an 41 a ES, a 
28 29 | #17 12 29 Ig O*) |toowggolay alitge .Domiscag, 
30 9 | 16 ll 34 Oc nofion Bs 23 | 28 
32 13 ~ 8 26 4 | bb 19. ..|.:,18 
34 % axe 3 4 25 Set. 24 14 
36 4 | 4 3 17 phot 8 2 
38 Babi 7 0 2) ocpos -pplic] 1 1a¢ of lag 
40 1 0 x4 0 a8 old Ms 3i| ed x8 Miobidnoo U 
42 2 a1) l 1 4 | ae ee 9 
44 0  " 0 4 2eT: — pike 1 
46 2 — 0 3 4 _ yans) 5 
48 _— — 1 1 l _ 1 1 
50 -— — 0 l 1 | — 2 3 
52 at. os $8 payne 0 0 = ig a 2 0 
54 — — ~~ 2 0 2310 | poiny 1 
56 — = - Riood 1 ; oo— | ot 1 
58 — — — | B.S 0 or | 1 0 
60 ie eee 7, 7 0 — a | 2 
62 ge A fio stasminhgxadshenbe sib reef Boagetda y sid itso oi tadi | 
64 *errcora od p iciowely Lidfeeereseaen oh 81) wttel bnew wanocti bidud 
66 my ‘se EER SIGE 0 - hi cisb off? 
ea a ale hares td one, loch emniged | eens 0 eh 
70 ad oy big nacisibepe, ladserahed ayes, sae goiishey le 
72 — — op ope o | = death Plime 
74 —_ —_— | age = 1 | ep — | Be 
76 A es ee ee ee ee eee ee es ee — | — 
78 ee Ne ae ee a i —- | — 
80 he | pes | ans | is Ltieseftmeqpes sisted oat 
a | | | | —— 
Σf | [illegible] | [illegible]* | 118 | 390 | 282 | 369 | 353 | 381
No. of expts. | [illegible]
No. of families measured | 51 | 51 | 59 | 93 | 49 | 43 | 53 | 66
μ₁′ | 21.05 | 20.48 | 24.90 | 27.25 | 22.90 | 20.38 | 25.72 | 25.64
μ₂ | 32.94 | 43.98 | 46.37 | 75.43 | 119.76 | 66.40 | 89.64 | 83.99
γ₁ | 0.79 | 0.72 | 0.67 | 0.57 | 0.87 | 0.18 | 0.45 | 0.66
γ₂ | 1.78 | 0.84 | 2.07 | 1.25 | 1.60 | −0.46 | 0.70 | 1.12




















* One organism failed to grow and suffered lysis. 
one or (rarely) more families of very unequal generation times. Thus for example with Bact.
coli anaerogenes in an experiment involving four families each of six organisms, the mean
generation times were
    26.3, 21.7, 28.3, 27.0;
the variances
    51.8, 2.2, 56.6, 135.5;
and the squared coefficients of variation
    0.0748, 0.00467, 0.0707, 0.186.

Irregularities of this kind, which were found in all the species examined, appear to make it 
doubtful that any simple meaning can be attached to the coefficient of variation of larger 
samples. 
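
The arithmetic of this example is easily repeated. The short sketch below recomputes the squared coefficients of variation from the rounded means and variances quoted, and also shows the value of g that each family would imply under the Type III relation c² = 1/g; the helper name is hypothetical, and small discrepancies in the last figure arise from the rounding of the quoted means and variances.

```python
def squared_cv(mean, variance):
    """Squared coefficient of variation, variance / mean^2."""
    return variance / mean ** 2

means = [26.3, 21.7, 28.3, 27.0]        # family means quoted in the text
variances = [51.8, 2.2, 56.6, 135.5]    # family variances quoted in the text

if __name__ == "__main__":
    for mu, var in zip(means, variances):
        c2 = squared_cv(mu, var)
        print(f"mean {mu:5.1f}  variance {var:6.1f}  c^2 = {c2:.3g}  implied g = {1.0 / c2:.1f}")
```

The implied values of g range over more than an order of magnitude within a single experiment, which is the irregularity remarked on above.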

In order to test the homogeneity of within-family variance, the Bartlett criterion, often
denoted by M (which for normal samples is distributed approximately as a χ²), was computed
for each experiment in the usual way and the sum, ΣM, formed for each group of
experiments. A correction for kurtosis was applied, given by Box (1953) as

\[ \Sigma M' = \Sigma M \Big/ \left(1 + \frac{n-1}{2n}\,\gamma_2\right). \]

This expression was simplified by taking a common corrective factor (1 + ½γ₂) for every
family, so biasing M′ towards smaller values.
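
A minimal sketch of the computation, under stated assumptions, is given here: Bartlett’s M is formed from the family variances with the customary small-sample scaling factor, and is then divided by the common factor (1 + ½γ₂). Whether the scaling factor was used in the original work is not stated, so its inclusion is an assumption, as are the function names.

```python
import math

def bartlett_m(variances, dfs):
    """Bartlett's criterion for k sample variances with degrees of freedom dfs."""
    k = len(variances)
    nu = sum(dfs)
    pooled = sum(v * d for v, d in zip(variances, dfs)) / nu
    m = nu * math.log(pooled) - sum(d * math.log(v) for v, d in zip(variances, dfs))
    # Customary scaling factor bringing M closer to chi-squared with k - 1 d.f.
    c = 1.0 + (sum(1.0 / d for d in dfs) - 1.0 / nu) / (3.0 * (k - 1))
    return m / c

def kurtosis_corrected(m_stat, gamma2):
    """Divide M by the common corrective factor (1 + gamma2 / 2)."""
    return m_stat / (1.0 + 0.5 * gamma2)

if __name__ == "__main__":
    # Variances of the four families of six organisms quoted in the text (5 d.f. each).
    variances = [51.8, 2.2, 56.6, 135.5]
    dfs = [5, 5, 5, 5]
    m_stat = bartlett_m(variances, dfs)
    print(f"M = {m_stat:.2f}")
    print(f"M' (Yule, gamma2 = 2.4) = {kurtosis_corrected(m_stat, 2.4):.2f}")
```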

Two assumptions were made in assigning γ₂: (i) If the underlying distribution is of Type
III, γ₂ is 6/g. In this case the γ₂ were calculated from the final estimates of Kendall’s g given
by curve fitting (Table 9). (ii) If the underlying distribution is Yule’s, γ₂ is very nearly 2.4.
The results of the test given in Table 2 show that the apparent heterogeneity is not signi- 
ficant under Rahn’s hypothesis, but is markedly so under Kendall’s. To this extent it might 
be argued that Rahn’s hypothesis is to be preferred. But examination of the figures in detail 
shows that the large contributions to M’ come mainly from families containing members of 
unusually short generation time, whereas it is the long upper tail of Yule’s distribution which 
inflates the fourth moment; near the origin the ordinates are exceedingly small. There is, 
however, no sufficient reason here to suspect Kelly & Rahn’s experiments of being ill- 
controlled. 

In order to assess the extent to which the overall variance of generation time is increased 
by inconstant experimental conditions, I have carried out conventional analyses of vari- 
ance on the means of families. From Table 3 it can be seen that in every case a significant 
part of the total variance is contributed by differences in the growth rate from experiment 
to experiment. If σ_w² and σ_b² are the true within- and between-experiment variances, the
observed between-experiment mean square will be an estimate of

\[ \sigma_w^{2} + \sigma_b^{2}\, \frac{N^{2} - \Sigma n^{2}}{N(K-1)} \]

(where K is the number of experiments, N the number of families, and n the number of
families in an experiment). On this basis we derive the figures in the last column of Table 3
as estimates of the experimental variance of the growth rate. Except for Pr. vulgaris,
these estimates of σ_b² are not great; nevertheless, estimates of the coefficient of variation
from the raw data will be too high.
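
The estimate of the between-experiment component follows by subtracting the within-experiment mean square and dividing by the coefficient (N² − Σn²)/{N(K − 1)}. In the sketch below the mean squares are those given for Bact. aerogenes in Table 3, but the numbers of families in the individual experiments are not reported in this section, so the sizes used (nine experiments of four families and five of three, making 51 in 14 experiments) are purely hypothetical.

```python
def between_experiment_variance(ms_between, ms_within, sizes):
    """Estimate the between-experiment variance component from unequal group sizes."""
    n_total = sum(sizes)   # N, total number of families
    k = len(sizes)         # K, number of experiments
    coeff = (n_total ** 2 - sum(n * n for n in sizes)) / (n_total * (k - 1))
    return (ms_between - ms_within) / coeff

if __name__ == "__main__":
    sizes = [4] * 9 + [3] * 5   # hypothetical allocation of 51 families to 14 experiments
    estimate = between_experiment_variance(23.53, 11.60, sizes)
    print(f"estimated between-experiment variance = {estimate:.2f}")
```

With these assumed sizes the estimate is close to, though not identical with, the tabulated figure; the difference reflects only the unknown allocation of families to experiments.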
Table 2. Modified Bartlett test of homogeneity of variance of families

                                   Bact.        Bact. coli     Pr.
                                   aerogenes    anaerogenes    vulgaris
(A) Kendall hypothesis:  γ₂        0.38         0.55           0.43
                         ΣM′       75           78             109
                         n         28           39             72
                         P(ΣM′)    0.000        0.000          0.000
(B) Rahn hypothesis:     γ₂        2.4          2.4            2.4
                         ΣM′       41           45             60
                         n         28           39             72
                         P(ΣM′)    c. 0.05      >0.1           >0.1




















Table 3. Analysis of variance of family means

Organism                 Source of        Sum of     Degrees of   Mean      F-ratio            Estimate of between-
                         variation        squares    freedom      square    (and P(F))         experiment variance
Bact. aerogenes          Between expts.    305.95    13            23.53    2.03 (0.05)         3.30
                         Within expts.     429.32    37            11.60
                         Total             735.27    50
Bact. coli anaerogenes   Between expts.    314.16    11            28.56    [illegible]         [illegible]
                         Within expts.     389.34    39             9.98
                         Total             703.50    50
Strep. faecalis          Between expts.    610.18     8            76.27    [illegible]         [illegible]
                         Within expts.    1609.21    50            32.18
                         Total            2219.39    58
Pr. vulgaris             Between expts.   2037.45    18           113.19    3.58 (<0.0001)     17.23
                         Within expts.    2274.42    72            31.59
                         Total            4311.87    90
By contrast, the coefficients of variation of generation times of families, with the exception
of Pr. vulgaris, are not significantly perturbed. The coefficients themselves (and their
squares) are very unsymmetrically distributed, but the distribution of their logarithms,
as shown by a graphical test, is roughly normal, and analysis of variance carried out on the
logarithms yields the results of Table 4. (For convenience the variable was taken as
100 log₁₀ c², to the nearest integer; the constant factors of course cancel in the F-ratio.)
Strep. faecalis again had to be omitted from this test, because many families (each consisting 
of a pair of sisters only) had zero variance. However, the test was also carried out on the 
squared coefficients of variation themselves; the sense of the results was exactly the same, 
and Strep. faecalis gave an F-ratio probability above 0-1. 


Table 4. Analysis of variance of family coefficients of variation

Organism                 Source of        Sum of     Degrees of   Mean      F-ratio
                         variance         squares    freedom      square    (and P(F))
Bact. aerogenes          Between expts.    20386     13           1568      0.84 (—)
                         Within expts.     67246     36           1868
                         Total             87632     49
Bact. coli anaerogenes   Between expts.    17668     11           1606      0.86 (—)
                         Within expts.     72701     39           1864
                         Total             90369     50
Pr. vulgaris             Between expts.   159425     18           8857      [illegible]
                         Within expts.    216582     72           3008
                         Total            376007     90


























Thus the coefficient of variation is relatively stable as between experiments, but its large 
interfamily variance is difficult to account for on Kendall’s hypothesis if, as is supposed, 
the generation time of an organism is independent of its immediate ancestry. It may well be 
a product of ‘delayed fission’; that is, the lapse of an appreciable interval between the division
of nucleus or cytoplasm, and the separation of cell wall. A single large delay in a family
postpones the termination of one organism and the inception of two others, and so may have 
a disproportionate effect on the coefficient of variation. To admit this possibility is to 
entertain the view that the measurements may not be directly relevant to either hypothesis. 

The experiments with Pr. vulgaris were evidently less successful than the rest as regards 
reproducibility of growth rate; this is perhaps to be associated with the organisms’ poten- 
tially more complex behaviour on the static medium used. 
Correlation between generation times


Kelly & Rahn (1932) suggested that some of the more extreme values occurring among 
their measurements were to be accounted for by ‘delayed division’. Such delay would ob- 
viously tend to generate negative correlation between mother and daughter and positive 
correlation between sisters. Rahn (1932) showed that this tendency existed, but he did not 
consider the measurements as a whole. It is probable that delayed division in this sense is 
too rigid a conception, and that a delay of variable magnitude always occurs; this does not 
affect the conclusion. 

The product-moment correlation coefficients between generation times of mothers and 
daughters (ρ_MD) and pairs of sisters (ρ_SS) in the present series of measurements are given in
Table 5. Not all the mother-daughter pairs are independent, since in a family of three or 
four generations some organisms appear twice as mothers and once as daughters. Scatter 
diagrams of the mother-daughter correlations showed the entries to be fairly uniformly 


Table 5. Correlation between generation times of mothers and daughters (ρ_MD) and sisters (ρ_SS).
Number of pairs of observations, p. Parentheses indicate that the coefficient is not significantly
different from zero at the 5% level

Organism                     ρ_MD        p      ρ_MD corr.    ρ_SS    p      ρ_SS corr.
                                                for bias                     for bias
Bact. aerogenes              (0.069)     360    —             0.77    231    —
Bact. coli anaerogenes       (−0.051)    318    —             0.48    210    —
Strep. faecalis              —           —      —             0.68     59    —
Pr. vulgaris                 —           —      —             0.74    195    —
B. mycoides                  −0.18       192    −0.42         0.42    103    0.32
B. subtilis, series (i)      (0.090)     283    (0.016)       0.76    156    0.80
B. subtilis, series (ii)     −0.37       252    −0.34         0.65    154    0.59
B. subtilis, series (iii)    (−0.005)    248    −0.16         0.60    169    0.36




























clustered about their centroid. This was true also for the bulk of sister-sister pairs; there was, 
however, a tail to the distribution extending along the bisector of the angle between the 
axes, indicating very strong association between pairs of sisters of unusually long generation 
time. 

For Bact. aerogenes and Bact. coli anaerogenes ρ_MD is not significantly different from zero,
and the immediate conclusion is that delay does not play a large part in determining the
variance of generation time. If this is true, neither does it contribute much to the correlation
between pairs of sisters (ρ_SS) which is yet quite large. There is no constraint implicit in the
hypotheses of Kendall and Rahn which requires sister cells to be alike; in fact, independence
is assumed. The high value of ρ_SS tells heavily either against the hypotheses, or against the
relevance to them of the observations. Now ρ_MD is compounded of observations made on
a number of different occasions. It is to be expected that the true mean value of τ is slightly
different for each experiment, and so if ρ_MD were really zero a small positive value might
be obtained from the combined data. It is possible therefore that the conditions bias both
ρ_SS and ρ_MD in favour of positive values; ρ_MD may be appreciably negative, and ρ_SS smaller
than it appears to be. However, inspection of the records leaves no doubt as to the great
similarity between sister organisms.

In the Bacillus group, ρ_MD is either zero or negative, but ρ_SS remains positive to a high
degree of significance. The septate mode of fission appears to be a more highly organized
process than the isthmoid mode, and it probably gives rise to very appreciable delay. Its
contribution to ρ_MD cannot be assessed without knowledge of its variance, though selected
examples suggest that the variance may be extremely large, especially in B. mycoides.
But the differences between the three series of B. subtilis indicate that ρ_MD is far from being
a simple property of the organism. The growth rate in series (ii) and (iii), as will be shown
below, was not constant; this should introduce a positive bias, yet ρ_MD for (ii) is the most
negative of all.
Fig. 3. Abnormal development of Strep. faecalis. (a) Successive appearance of cells.
(b) Corresponding family tree; generation times in minutes.


The estimates of ρ_SS and ρ_MD are affected by the presence of bias in the frequencies. I have
attempted a correction in the following way.

(i) If M and D are the generation times of mother and daughter, and R(M, D) the
observed frequency of the two together, an estimate of the true frequency is

R(M, D)/α(M + D),

where α is the biasing factor of equation (2).

(ii) If S₁, S₂ are the generation times of a pair of sisters, and R(S₁, S₂) the frequency of
the pair, the corrected value is R(S₁, S₂)/α{max (S₁, S₂)}. The ρ calculated from frequencies
corrected in this way are also shown in Table 5. No new regularity appears, and the sense of
the foregoing remarks is not altered.
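
The correction amounts to a weighted product-moment correlation, each observed pair carrying weight 1/α. The sketch below illustrates this. Equation (2) is not reproduced in this section, so the biasing factor used, α(t) = (e^{k(T−t)} − 1)/(e^{kT} − 1), is a hypothetical stand-in; the entry of each sister pair in both orders, the data and all names are likewise assumptions of the example.

```python
import math

def alpha(t, k, T):
    """Hypothetical biasing factor; the form of equation (2) may differ."""
    return (math.exp(k * (T - t)) - 1.0) / (math.exp(k * T) - 1.0)

def weighted_correlation(xs, ys, ws):
    """Product-moment correlation with observation weights ws."""
    total = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / total
    my = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    syy = sum(w * (y - my) ** 2 for w, y in zip(ws, ys))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    return sxy / math.sqrt(sxx * syy)

def corrected_mother_daughter(pairs, k, T):
    """Weight each (M, D) pair by 1 / alpha(M + D)."""
    ws = [1.0 / alpha(m + d, k, T) for m, d in pairs]
    return weighted_correlation([m for m, _ in pairs], [d for _, d in pairs], ws)

def corrected_sisters(pairs, k, T):
    """Weight each sister pair by 1 / alpha(max(S1, S2)); enter it in both orders."""
    xs, ys, ws = [], [], []
    for s1, s2 in pairs:
        w = 1.0 / alpha(max(s1, s2), k, T)
        xs += [s1, s2]
        ys += [s2, s1]
        ws += [w, w]
    return weighted_correlation(xs, ys, ws)

if __name__ == "__main__":
    k, T = 0.028, 80.0   # hypothetical growth constant (per min.) and observation period
    sisters = [(20, 22), (25, 24), (30, 33), (18, 17)]
    mothers_daughters = [(20, 25), (22, 21), (30, 18), (24, 26)]
    print(f"corrected sister correlation = {corrected_sisters(sisters, k, T):.3f}")
    print(f"corrected mother-daughter correlation = "
          f"{corrected_mother_daughter(mothers_daughters, k, T):.3f}")
```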

Once delayed fission is admitted as a possible contribution to the total variance of genera- 
tion time, it becomes a matter of importance to assess its variance, since it may turn out 
so large as to render nugatory any attempt on the present lines to test the hypotheses of 
Kendall and Rahn. A single overt example of delay was found among the observations on 
Strep. faecalis. Fig. 3a shows the appearances; a single cell, instead of dividing when it had 
reached the usual size, continued to grow until its length was about three times its diameter. 
It then split into a sphere and short rod, and fission of the rod followed in 6 min. Fig. 3b is
the family tree, with the times of fission placed so as to correspond horizontally with 
Fig. 3a. This example is sufficient to engender suspicion that minor degrees of delay may be 
much more frequent. 

Some further progress might be made by a more detailed analysis of correlation among 
the generation times. I have not attempted such analysis because a direct experimental 
approach should be possible through the use of ultra-violet or phase-contrast illumination. 

I think it likely that the observed values of ρ_MD will be found to result from two opposing
effects: a small positive correlation, corresponding to the usual succession of τ values,
overlaid by less frequent successions of great disparity. Taken together with the large dispersion
of within-family variance, this suggests the following picture: a very large family tree,
representing the numerical and temporal extension of a culture, consists of a patchwork of
areas within some of which growth is steady while within others it is erratic. There is a kind
of correlation between mothers and daughters, but it is not such as the crude product- 
moment coefficient is well suited to describe. 


The generation time distributions 
(a) B. mycoides and B. subtilis 


Although no quantitative conclusions relevant to the hypotheses of Kendall and Rahn 
can be drawn from the data on these species, the observations have an important bearing 
on the theory of fission. The results of fitting the modified Type III and Yule distributions 


Table 6. Frequency function parameters and goodness of fit

Frequency function                         B. mycoides    B. subtilis    B. subtilis    B. subtilis
                                                          series (i)     series (ii)    series (iii)
Pearson Type III with              g       4.07           6.11           7.01           7.47
allowance for bias (eq. (7)):      m       7.04           [illegible]    [illegible]    3.90
                                   χ²      19.2           57.3           17.8           15.4
                                   n       [illegible]    15             16             16
                                   P(χ²)   0.32           0.000          0.34           0.50
Yule, with allowance for bias      g       4.64           9.48           —              —
(eq. (8)):                         m       13.3           8.18           —              —
                                   χ²      [illegible]    74.2           —              —
                                   n       [illegible]    14             —              —
                                   P(χ²)   0.22           0.000          —              —

(equations (7) and (8)) are given in Table 6 (see also Fig. 6). Series (i) of B. subtilis relates
to observations begun at 2½ hr. from inoculation, series (ii) at 4½ hr., series (iii) to observations
on organisms growing over diluted sheep blood.

Not only do these species exhibit a higher dispersion of generation time than the rest; 
the lengths of the organisms are also correspondingly variable, though the parallel between 
generation time and length at inception or termination is not exact. The first few vegetative
organisms to grow from the spore are often very long (upwards of 30 μ) and although the
mean length later diminishes, the change is not so well marked as in say Bact. aerogenes at 
the beginning of the logarithmic growth phase. B. mycoides cannot be said to possess a 
normal mean length; the organisms become progressively shorter over a period which may 
be as long as 24 hr. in some media. While a diminution in mean length is occurring, the num- 
ber growth rate must be greater than the mass growth rate, and since it is initially less, 
there is acceleration at some stage. I have therefore checked the uniformity of growth 
during measurements of generation time in the following way: 

It will be recalled that all the experiments grouped into any one column of Table 1 were 
begun at the same epoch in the life of the culture, and each occupied 80 min. Number growth 
curves were constructed by adding together the number of organisms under observation 
at corresponding times in each experiment. For B. mycoides and B. subtilis series (i) the 
logarithm of this sum was a linear function of time (Fig. 4), but for B. subtilis series (ii) and 
(iii), there was marked curvature especially over the earlier part of the period. Numerical 
evaluation of the age distribution by means of equation (4) was also carried out, and 
compared with the observed age distribution at the 80 min. epoch; series (ii) and (iii) showed 
an obvious excess of young organisms, as would be expected from an increasing growth rate. 
These two series are therefore of no quantitative value. 

Fig. 4. Number growth rate of B. mycoides. (Abscissa: time from beginning of experiments, minutes.) 

Fig. 5. B. mycoides growing on cellophane, 5 hr. from inoculation. A single cell has 
split off from one end of a long organism. 
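The uniformity check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the computation actually carried out: the counts are invented, and `pooled_log_counts` and `max_deviation_from_line` are hypothetical helper names. The numbers of organisms under observation are summed across experiments at corresponding times, and the logarithm of the sum is compared with its least-squares straight line.

```python
import math

def pooled_log_counts(experiments):
    """Sum the counts of all experiments at corresponding times, then take logs."""
    totals = [sum(counts) for counts in zip(*experiments)]
    return [math.log(t) for t in totals]

def max_deviation_from_line(times, values):
    """Largest absolute residual from the least-squares straight line."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return max(abs(v - (intercept + slope * t)) for t, v in zip(times, values))

# Invented counts at 0, 20, 40, 60 and 80 min in three experiments (assumption).
times = [0, 20, 40, 60, 80]
steady = [[4, 8, 16, 32, 64], [3, 6, 12, 24, 48], [5, 10, 20, 40, 80]]
accelerating = [[4, 5, 8, 20, 64], [3, 4, 7, 16, 48], [5, 6, 10, 25, 80]]

steady_dev = max_deviation_from_line(times, pooled_log_counts(steady))
accel_dev = max_deviation_from_line(times, pooled_log_counts(accelerating))
```

A small residual indicates a steady number growth rate over the period; a large one corresponds to the marked curvature reported for series (ii) and (iii).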

The extraordinarily wide range of generation times in B. mycoides (Table 1) is a concomitant 
of its readiness to divide at any available point (i.e. between any pair of cells). 
The exaggerated appearance of Fig. 5 is by no means a rarity; the frequency with which 
a single cell is split off from an organism of eight or more cells is diagnostic of the species. 
Such single-celled organisms are viable, but develop rather slowly, and some do not divide 
within 3 hr. of their inception; they are not spores (cf. Bergersen, 1954). B. subtilis shows 
something of the same freedom in its earlier stages of growth, but it reverts more quickly 
to regularity. Clearly, in these species the generation time is intimately connected with the 
coarser structure of the organism, and only remotely with the nuclear fission. 

The distribution of B. subtilis at 2½–4 hr. (series (i)) is notable for its approximate symmetry 
(Table 1), in which respect it is quite outstanding; neither the Type III nor Rahn’s 
distribution gives an acceptable fit. 

The experiment of growing B. subtilis over blood was an attempt to test the constancy of 
the coefficient of variation of generation time. Pearce and I had found (1951) that in these 
circumstances the organisms were very short and uniform in size, which suggested that the 
dispersion would also be small. The blood used in the present work was diluted with saline 
in order to reduce its viscosity, and over this mixture long organisms were first formed as 
usual. The mean length gradually diminished, but at the intermediate stages both long and 
short organisms were present, each clone being predominantly of one sort or the other. As 
I have shown, the growth rate increased considerably during the course of observation. 
Despite the imperfection of the experimental conditions, the coefficient of variation (0.36) 
was rather less than in the other two series (0.40 and 0.37). 

Fig. 6. Generation time distributions of B. mycoides and B. subtilis, 
with modified Pearson Type III frequency functions. 

These rather disconnected remarks indicate that the behaviour of B. mycoides and 
B. subtilis is so complex that the experiments are quite insufficient to expose it fully. The 
distributions, however, possess one regular feature worth examination. 

Fig. 6 shows that the frequency of small generation times in B. mycoides and B. subtilis 
series (i) is greater than the expectation accorded by the assumed frequency function (a 
modified Type III). This is true also for B. subtilis series (ii) and (iii) (and has since been 
observed in B. megatherium). The observed frequencies near the origin are uniformly so 
large as to suggest that the true frequency function approaches zero with a non-zero slope 
(I have pointed out that the observations are less liable to error in this region than elsewhere). 
In no case did two successive fissions occur in the same 2 min. interval. It will be 
seen from Fig. 6 that no Pearson Type III or Yule function could give a tolerable fit over the 
higher range without being in defect at low values of τ. Both of these functions have a 
finite positive and non-zero slope at the origin only when g = 2 and then 

[the two displayed slope expressions are illegible in the source] respectively. 

But the distributions with g = 2 have values of γ₁ and γ₂ quite different from those observed. 
Kelly & Rahn’s results for B. cereus are equally suggestive; their grouped frequencies, 
in order of increasing mean τ, are 9, 2, 4, 5, 17, .... 
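As a hedged illustration of the remark about the slope at the origin, the calculation can be worked for the unmodified forms of the two functions, written with scale m and index g. This is an assumption: the bias-corrected equations (7) and (8) used for the fits may differ in detail, and the symbols f_III and f_Y are introduced here only for the purpose.

```latex
% Assumed unmodified forms (scale m, index g); not the bias-corrected eqs (7) and (8).
\[
f_{\mathrm{III}}(\tau) = \frac{\tau^{g-1} e^{-\tau/m}}{m^{g}\,\Gamma(g)}, \qquad
f_{\mathrm{Y}}(\tau) = \frac{g}{m}\, e^{-\tau/m}\bigl(1 - e^{-\tau/m}\bigr)^{g-1}.
\]
% Near the origin both behave like a multiple of \tau^{g-1}:
\[
f_{\mathrm{III}}(\tau) \approx \frac{\tau^{g-1}}{m^{g}\,\Gamma(g)}, \qquad
f_{\mathrm{Y}}(\tau) \approx \frac{g\,\tau^{g-1}}{m^{g}} \qquad (\tau \to 0).
\]
% Hence the slope at the origin is zero for g > 2, infinite for 1 < g < 2,
% and for g = 2 it takes the finite positive values
\[
f_{\mathrm{III}}'(0) = \frac{1}{m^{2}}, \qquad f_{\mathrm{Y}}'(0) = \frac{2}{m^{2}}.
\]
```

On these assumed forms a non-vanishing slope at the origin therefore pins the index at g = 2, which is the constraint the text finds incompatible with the observed higher moments.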

Among the unicellular organisms, one instance (Strep. faecalis, τ = 6 min.) has already 
been noted of a generation time which was unusually short because of delayed fission in the 
previous generation. It is perhaps significant that this strain, like the Bacilli, exhibited the 
septate mode of fission. But it cannot be said of B. mycoides that the shortest generation 
times are associated with anomalous development, for reasons already given. Many 
instances of short generation time in B. subtilis can, however, be accounted for as resulting 
from an unusual succession of fissions. Fig. 7a represents the usual succession; an organism 
about to divide has typically three completed party walls (represented by pairs of dots). 
At a time t₀ it divides at or near its centre, and the two daughters similarly at t₁₁ and t₁₂. For 
simplicity, the organism is shown as dividing without growing. The intervals t₁₁ − t₀ and 
t₁₂ − t₀ are not generally extreme. Occasionally (Fig. 7b) the first division occurs at a distance 
of about one-quarter of the length of the organism from one end, and a second division 
follows very shortly; then the interval t₁₁ − t₀ is small. That half of the organism to the right 
of the arrow in Fig. 7b can be usefully regarded as an organism manqué with a negative 
generation time t₀ − t₁₁, i.e. the associated separation of cell walls has occurred in the opposite 
order to the corresponding separation of cytoplasm. 

Fig. 7. Normal (a) and abnormal (b) succession of fissions in B. subtilis. 

I believe that the impression given by the generation time distributions of B. mycoides 
and B. subtilis, namely, that the slope does not vanish near the origin, is a just one, and that 
the phenomenon is of fundamental significance for the mechanism of fission in Bacillus; 
I hope to deal with it more fully in another communication. 


(b) Bact. aerogenes, Bact. coli anaerogenes, Strep. faecalis, Pr. vulgaris; Assessment of the 
genetic parameter 


The result of fitting the Type III and Yule distributions to the observations on unicellular 
organisms is shown in Table 7 A–D. 

Comparison of sections A and C shows at once that the Type III distribution fits the data 
better than does Yule’s, in each separate case. Moreover, the method of moments gives 
(section B) substantially the same results as the method of maximum likelihood. Applied 
to Yule’s distribution, it gives a rather better fit, with very different values of the para- 
meters (section D). This disagreement, with the moment fit better than the maximum 
likelihood, seems to indicate that Yule’s distribution is inappropriate to the raw data. 

However, the analysis of variance has shown that the crude coefficient of variation (c) 
of τ must exceed the supposed true and constant value (c₀) because of the experimental 
dispersion of the mean, and so the method of moments (Table 7 B and D) underestimates g, 
sampling fluctuations apart. As a further consequence, it does not follow from these results 
that Kendall’s hypothesis is to be preferred to Rahn’s. 

Kelly & Rahn’s results on Bact. aerogenes have been shown by Finney & Martin (1951) 
to be in satisfactory agreement with Rahn’s hypothesis so far as the χ² test can distinguish. 
I have already pointed out that there is some doubt as to the correct assignment of mean 
generation time to the grouped frequencies, but with the same assignment as that adopted 
by Finney & Martin, viz. 

    τ = 2½    7½    12½    17½
    f = 0     1     7      61

I find that the Type III curve fits less well than Rahn’s and shows no advantage in its 
estimate of c and γ₁ (Table 8). With other assignments of generation times to frequencies 

    τ = 2½    5     10     15
    f = 0     1     7      61
and
    τ = 5     10    15     20
    f = 0     1     7      61

the estimated parameters are different, but the sense of the results is the same. 

Hypothesis requires that the genetic parameter (and therefore the true coefficient of 
variation of τ) shall remain constant despite variations in growth rate. Finney & Martin 
(1951) pointed out that the best estimate of it would be obtained by fitting frequency 
functions to the data for each experiment separately, constraining g to be the same for each, 
but allowing m to differ as required from experiment to experiment. As they realized, this 
would be an exceedingly laborious process with Rahn’s frequency function, but it turns 
out that for the Type III function a simple maximum-likelihood solution exists. 

Table 7. Frequency function parameters and goodness of fit

Frequency function and                Bact.        Bact. coli     Strep.       Pr.
method of fitting                     aerogenes    anaerogenes    faecalis     vulgaris

(A) Pearson Type III, maximum likelihood:
      g                               13.4         9.48           12.8         9.08
      m                               1.57         2.16           1.95         3.00
      χ²                              16.1         15.1           11.6         27.1
      n                               11           12             9            15
      P(χ²)                           0.14         0.23           0.24         0.028

(B) Pearson Type III, moments:
      g                               13.4         9.53           13.4         9.84
      m                               1.57         2.15           1.86         2.77
      χ²                              16.1         15.3           10.2         23.4
      n                               11           12             9            15
      P(χ²)                           0.14         0.22           0.33         0.076

(C) Yule, maximum likelihood:
      g                               32.9         17.1           26.1         14.2
      m                               5.21         5.99           6.56         8.48
      χ²                              29.4         24.1           20.0         44.9
      n                               12           13             9            16
      P(χ²)                           0.0033       0.030          0.018        0.0002

(D) Yule, moments:
      g                               59.9         27.2           59.2         29.6
      m                               4.50         5.23           5.34         6.84
      χ²                              22.0         13.2           15.9         42.5
      n                               10           11             8            14
      P(χ²)                           0.015        0.28           0.043        0.0001

(E) Pearson Type VI, first two moments and σ_a²:
      g                               15.1         10.7           15.9         13.1
      s                               136          97.7           92.4         45.1
      b                               2848         1980           2274         1202
      χ²                              15.4         15.8           11.3         30.1
      n                               10           11             8            14
      P(χ²)                           0.12         0.15           0.19         0.0074

(F) Yule-hyperbolic (eq. (14)), first two moments and σ_a²:
      g                               79           36             92           55
      l₁                              24.3         24.3           29.7         35.1
      l₂                              18.1         17.1           20.6         20.6
      χ²                              21.2         14.7           15.1         40.7
      n                               10           10             7            13
      P(χ²)                           0.020        0.14           0.035        0.0001

It is convenient to write the Type III function in the form 

\[ \frac{g^{g}\,\tau^{g-1}\,e^{-g\tau/a}}{a^{g}\,\Gamma(g)}, \]

where the new parameter a is the mean of the distribution, equal to gm. Let $a_r$ be the value 
of a to be assigned for the rth experiment. Then the likelihood for that experiment is 

\[ \log L_r = (g-1)\,\Sigma(f\log\tau) - g\,\Sigma f\tau/a_r - g(\log a_r)\,\Sigma f + g(\log g)\,\Sigma f - \{\log\Gamma(g)\}\,\Sigma f, \]

where the f are the observed frequencies of τ in the rth experiment. 
On summing over all experiments and differentiating we obtain the desired maximum-likelihood 
estimate of g: 

\[ n\{\psi(g) - \log g\} = \sum_r \{\Sigma(f\log\tau)\} - \sum_r \{(\log a_r)\,\Sigma f\}, \tag{9} \]

where 

\[ a_r = \Sigma f\tau / \Sigma f, \]

and n is $\sum_r (\Sigma f)$, the total number of observations. The mixed derivatives 

\[ \frac{\partial^2}{\partial g\,\partial a_r} \quad\text{and}\quad \frac{\partial^2}{\partial a_r\,\partial a_s} \quad (r \neq s) \]

of $\sum_r \log L_r$ are all zero (this is the virtue of using the parameter a instead of m), and so, as 
in the simple case, the asymptotic variance of $\hat g$ can be estimated as 

\[ \operatorname{var}\hat g = 1/[\,n\{\psi'(g) - 1/g\}\,]. \]
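Equation (9) has no closed-form solution for g, but it is easily solved numerically. The following Python sketch is an illustration only, under stated assumptions: the observations are taken ungrouped (every f equal to 1), the digamma and trigamma functions are approximated by their standard recurrence and asymptotic series, the root is found by bisection, and the names `digamma`, `trigamma` and `fit_common_g` are hypothetical. The demonstration data are simulated, not taken from the experiments.

```python
import math
import random

def digamma(x):
    """psi(x) for x > 0: recurrence up to x >= 6, then the asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def trigamma(x):
    """psi'(x) for x > 0, by the same device."""
    acc = 0.0
    while x < 6.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return acc + inv + 0.5 * inv2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))

def fit_common_g(experiments):
    """experiments: list of lists of individual generation times (all > 0).
    Returns the common g of equation (9) and its asymptotic standard error."""
    n = sum(len(taus) for taus in experiments)
    d = 0.0
    for taus in experiments:
        a_r = sum(taus) / len(taus)              # a_r = sum(f * tau) / sum(f)
        d += len(taus) * math.log(a_r) - sum(math.log(t) for t in taus)
    d /= n                                       # equation (9): log g - psi(g) = d
    if d <= 0.0:
        raise ValueError("no dispersion within experiments")
    lo, hi = 1e-3, 1e6
    for _ in range(200):                         # bisection on a log scale
        mid = math.sqrt(lo * hi)
        if math.log(mid) - digamma(mid) > d:     # left side decreases with g
            lo = mid
        else:
            hi = mid
    g = math.sqrt(lo * hi)
    se = math.sqrt(1.0 / (n * (trigamma(g) - 1.0 / g)))
    return g, se

# Simulated example: three experiments with different means but a common g = 12.
rng = random.Random(0)
demo = [[rng.gammavariate(12.0, mean / 12.0) for _ in range(500)]
        for mean in (18.0, 21.0, 24.0)]
g_hat, g_se = fit_common_g(demo)
```

Because each experiment keeps its own mean, dispersion of the mean between experiments does not inflate the estimate of the coefficient of variation, which is the point of the constrained fit.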


Table 8. Comparison of Pearson Type III and Yule distributions 
applied to Kelly & Rahn’s results for Bact. aerogenes 

              g       m       χ²             n     P(χ²)    c       γ₁
Type III      10.8    2.90    17.0           8     0.030    0.30    0.61
Yule          26.0    8.14    [illegible]    9     0.094    0.33    1.18
From data     -       -       -              -     -        0.31    0.91






The results of applying equation (9) to the data for the unicellular organisms are shown in 
Table 9 E. The g values obviously differ significantly from species to species, and all are 
less than 20, the approximate figure arrived at by Kendall (1948) from a consideration of 
Kelly & Rahn’s experiments on Bact. aerogenes. In spite of the fact that the variance of τ 
in the present measurements on that organism is less than in theirs, the value of g is still 
only 16.5; Kendall, of course, was not aware that the apparent heterogeneity of the data was 
largely intrinsic. 

Although the preceding form of treatment is not available for Rahn’s hypothesis, a 
satisfactory approach can in any case be made through the coefficient of variation. 

A number of generation time experiments carried out on one species can be considered 
to provide a sample of frequency functions $h(\tau, a, c_0)$ drawn from a population in which the 
true coefficient of variation $c_0 = \sigma/a$ is by hypothesis constant, but the mean a is dispersed. 
In this hypothetical population of h-functions, let j(a) be the frequency function of the mean. 
Then considering the whole series of experiments together, the expectation of an observation 
τ will be 

\[ dF = \int_{a_l}^{a_u} h(\tau, a, c_0)\, j(a)\, da\, d\tau, \tag{10} \]

where $a_l$ and $a_u$ are the limits within which a must lie. The moments about zero are then 

\[ \mu'_r = \int_0^\infty \int_{a_l}^{a_u} \tau^{r}\, h(\tau, a, c_0)\, j(a)\, da\, d\tau. \]

It can safely be assumed that the order of integration may be inverted, so that 

\[ \mu'_r = \int_{a_l}^{a_u} j(a) \int_0^\infty \tau^{r}\, h(\tau, a, c_0)\, d\tau\, da. \]

In particular, 

\[ \mu'_1 = \int_{a_l}^{a_u} a\, j(a)\, da = \bar a, \text{ say,} \]

\[ \mu'_2 = \int_{a_l}^{a_u} j(a)\, a^{2} (c_0^{2} + 1)\, da = (c_0^{2} + 1)\, \mu'_2(j), \text{ say.} \]

Hence the observed crude coefficient of variation, c, is related to $c_0$ by 

\[ c^{2} = \mu'_2/(\mu'_1)^{2} - 1 = (c_0^{2} + 1)\, \mu'_2(j)/\bar a^{2} - 1. \]

(Evidently $c > c_0$ always, since $\mu'_2(j) > \bar a^{2}$.) Or, more symmetrically, writing $c_j$ for the 
coefficient of variation of the j-distribution, 

\[ c^{2} + 1 = (c_0^{2} + 1)(c_j^{2} + 1). \tag{11} \]

Thus from a crude coefficient c a corrected value $c_0$ can be calculated if the experimental 
variance of the mean is known, always under the hypothesis that $c_0$ is constant. The necessary 
figures are already available from the working of the analysis of variance; the $\sigma_a^2$ of 
Table 3 are estimates of $\mu_2(j)$, and $c_j^{2} = \mu_2(j)/\bar a^{2}$. 
Application of equation (11) then yields the figures in Table 9 B. 

It is to be noted that the relation between g and c₀ so calculated is determined by the 
h-distribution, and is independent of j(a). 

If from the corrected values c₀ new estimates of Kendall’s g (= 1/c₀²) are calculated, the 
results agree well with the revised maximum likelihood estimates (Table 9 C and E), and so 
the assumption that the h-distribution is of Type III is not an unreasonable one. 
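The correction of equation (11) and the conversion to Kendall’s g amount to two lines of arithmetic, sketched here in Python. The function names `corrected_cv` and `kendall_g` are hypothetical, and the only data used are the corrected coefficients and g values printed in Table 9 B and C.

```python
import math

def corrected_cv(c, c_j):
    """Solve c^2 + 1 = (c0^2 + 1)(c_j^2 + 1), equation (11), for c0."""
    c0_sq = (c * c + 1.0) / (c_j * c_j + 1.0) - 1.0
    if c0_sq < 0.0:
        raise ValueError("c_j exceeds c; equation (11) has no real solution")
    return math.sqrt(c0_sq)

def kendall_g(c0):
    """Kendall's g for a Type III distribution: g = 1 / c0^2."""
    return 1.0 / (c0 * c0)

# Corrected coefficients c0 (Table 9 B) and the g values printed in Table 9 C.
table_9 = {
    "Bact. aerogenes": (0.257, 15.1),
    "Bact. coli anaerogenes": (0.306, 10.7),
    "Strep. faecalis": (0.251, 15.9),
    "Pr. vulgaris": (0.277, 13.1),
}
recomputed = {name: kendall_g(c0) for name, (c0, _) in table_9.items()}
```

The recomputed values agree with the tabulated ones to within the rounding of c₀ to three figures.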

It is now possible to derive also improved values of Rahn’s g from 

\[ c_0^{2} = \frac{\Gamma_2(1) - \Gamma_2(g+1)}{\{\Gamma_1(g+1) - \Gamma_1(1)\}^{2}}. \]

These values are given in Table 9 D. 
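A minimal Python sketch of this inversion follows. It rests on two assumptions that should be stated plainly: that Rahn’s (Yule’s) frequency function is that of the largest of g independent exponential times with a common mean, so that for integral g the differences of digamma and trigamma functions reduce to the sums of 1/k and 1/k², and that g is restricted to integers. The names `rahn_cv` and `rahn_g_from_cv` are hypothetical.

```python
import math

def rahn_cv(g):
    """Coefficient of variation of the largest of g equal exponential times."""
    h1 = 0.0  # sum of 1/k   (digamma(g+1) - digamma(1))
    h2 = 0.0  # sum of 1/k^2 (trigamma(1) - trigamma(g+1))
    for k in range(1, g + 1):
        h1 += 1.0 / k
        h2 += 1.0 / (k * k)
    return math.sqrt(h2) / h1

def rahn_g_from_cv(c0, g_max=100000):
    """Smallest integer g whose coefficient of variation does not exceed c0."""
    h1 = 0.0
    h2 = 0.0
    for g in range(1, g_max + 1):
        h1 += 1.0 / g
        h2 += 1.0 / (g * g)
        if math.sqrt(h2) / h1 <= c0:
            return g
    raise ValueError("c0 is too small for the chosen g_max")

# Pairs (g, c0) as printed in Table 9 D and B.
table_9_rahn = [(79, 0.257), (36, 0.306), (92, 0.251), (55, 0.277)]
implied_cv = [rahn_cv(g) for g, _ in table_9_rahn]
```

Since the coefficient of variation falls only slowly as g grows, a change in the third decimal of c₀ moves the implied g by several units, so an inversion from the rounded tabulated c₀ need not reproduce the printed g exactly.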

Equation (10) suggests a method of developing modified frequency distributions capable 
of representing the crude data more accurately, and of being subjected to a meaningful 
significance test. Each series of experiments furnishes a number of values $a_r$ and a variance 
$\sigma_a^2$. The exact distribution of a is not known, but a graphical examination of the individual 
$a_r$ shows a marked modal tendency. Suppose then, for convenience, that a is distributed 
as a Pearson Type V variate: 

\[ j(a) = \frac{b^{s}\, a^{-s-1}\, e^{-b/a}}{\Gamma(s)}, \]

with overall 

\[ \bar\tau = \bar a = b/(s-1), \]

and 

\[ \sigma_a^{2} = \mu_2(j) = b^{2}/\{(s-1)^{2}(s-2)\}. \]

Then if the distribution $h(\tau, a, c_0)$ is of Type III, equation (10) is easily integrated, and 
becomes 

\[ dF = \frac{(b/g)^{s}\, \tau^{g-1}}{B(g,s)\, (\tau + b/g)^{g+s}}\, d\tau, \tag{12} \]

where B(g, s) is the complete beta-function. This is a Pearson Type VI distribution. 
I have fitted equation (12) to the raw data by using the first two moments and the 
variance $\sigma_a^2$ of the mean, to determine b, s and g (Table 7 E; the values of g are of course the 
same as those in Table 9 C, since the j-distribution is irrelevant). To neglect $\sigma_a^2$ and use the 
first three moments would omit part of the information available, viz. the knowledge of 
how the τ were grouped by experiments. The goodness of fit is no better than that of the 
simple Type III function. One of the fitted curves (Pr. vulgaris, for which $\sigma_a^2$ is largest) 
is shown in Fig. 8 (curve (i)). 
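The moment fit of equation (12) can be sketched as follows. This is an illustrative reading of the procedure, not a record of the original arithmetic: it assumes that g follows from equation (11) with c₀² = 1/g, that s follows from the Type V relation c_j² = 1/(s − 2), and that b follows from the mean b/(s − 1). The function names are hypothetical, and the numerical check uses the Bact. aerogenes parameters of Table 7 E.

```python
import math

def type_vi_params(mean, c, var_a):
    """Moment fit of equation (12): returns (g, s, b) from the overall mean,
    the crude coefficient of variation c and the variance var_a of the means."""
    cj_sq = var_a / mean ** 2                    # c_j^2 = mu_2(j) / abar^2
    c0_sq = (c * c + 1.0) / (cj_sq + 1.0) - 1.0  # equation (11)
    g = 1.0 / c0_sq                              # Type III: c0^2 = 1/g
    s = 2.0 + 1.0 / cj_sq                        # Type V: c_j^2 = 1/(s - 2)
    b = mean * (s - 1.0)                         # Type V: mean = b/(s - 1)
    return g, s, b

def type_vi_pdf(tau, g, s, b):
    """Density of equation (12), evaluated through logarithms for stability."""
    if tau <= 0.0:
        return 0.0
    k = b / g
    log_beta = math.lgamma(g) + math.lgamma(s) - math.lgamma(g + s)
    log_f = (s * math.log(k) + (g - 1.0) * math.log(tau)
             - (g + s) * math.log(tau + k) - log_beta)
    return math.exp(log_f)

def numeric_moments(g, s, b, upper=200.0, step=0.01):
    """Midpoint-rule estimates of total mass, mean and variance on (0, upper)."""
    m0 = m1 = m2 = 0.0
    for i in range(int(upper / step)):
        t = (i + 0.5) * step
        f = type_vi_pdf(t, g, s, b) * step
        m0 += f
        m1 += t * f
        m2 += t * t * f
    mean = m1 / m0
    return m0, mean, m2 / m0 - mean ** 2

# Parameters printed for Bact. aerogenes in Table 7 E.
G, S, B = 15.1, 136.0, 2848.0
mass, mean_tau, var_tau = numeric_moments(G, S, B)
```

The numerical moments give a direct check that the mixed distribution has mean b/(s − 1) and a crude coefficient of variation satisfying equation (11).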


Table 9. The crude (c) and corrected (c₀) coefficients of variation, 
with final estimates of the genetic parameter (g) 

                                        Bact.         Bact. coli     Strep.        Pr.
                                        aerogenes     anaerogenes    faecalis      vulgaris

(A) c                                   0.273         0.324          0.273         0.319
(B) c₀                                  0.257         0.306          0.251         0.277
(C) g (Kendall) from c₀                 15.1          10.7           15.9          13.1
(D) g (Rahn) from c₀                    79            36             92            55
(E) g (Kendall) by maximum likelihood   16.5 ± 1.1    10.9 ± 0.7     18.6 ± 2.3    14.4 ± 1.0






Another convenient choice of j(a) is to take it as rectangular with mean $\bar a$ and range 
$2\sqrt{3}\,\sigma_a$. Then, still with a Type III distribution for $h(\tau, a, c_0)$, equation (10) becomes 

\[ dF = \frac{g}{2\sqrt{3}\,\sigma_a\,(g-1)} \left\{ I\!\left(\frac{g\tau}{\bar a - \sqrt{3}\,\sigma_a},\; g-1\right) - I\!\left(\frac{g\tau}{\bar a + \sqrt{3}\,\sigma_a},\; g-1\right) \right\} d\tau, \tag{13} \]

where I(x, p) is the incomplete gamma function 

\[ I(x, p) = \frac{1}{\Gamma(p)} \int_0^{x} e^{-v}\, v^{p-1}\, dv. \]

This distribution, fitted to the Pr. vulgaris data, is also shown in Fig. 8 (ii). The χ² is 29.5, 
as against 30.1 for the Type VI, and the close geometrical similarity of the two curves 
shows that it matters little what functional form is chosen for j(a). A fortiori this is true for 
the other organisms, for which $\sigma_a^2$ is much smaller. 

Yule’s distribution does not combine readily, in equation (10), with any of the standard 
distributions. However, after the result of the previous paragraph, we may take 

\[ j(a) = A/a \qquad (l_2 \le a \le l_1), \]

i.e. j(a) is a segment of a rectangular hyperbola. Then 

\[ A = 1/\log(l_1/l_2), \]

\[ \bar a = A(l_1 - l_2), \]

\[ \sigma_a^{2} + \bar a^{2} = \mu'_2(j) = \tfrac{1}{2} A (l_1^{2} - l_2^{2}). \]

Equation (10) becomes 

\[ dF = \int_{l_2}^{l_1} \frac{A g \Psi}{a^{2}}\, e^{-\tau\Psi/a} \bigl(1 - e^{-\tau\Psi/a}\bigr)^{g-1}\, da\, d\tau, \]

where $\Psi = \Gamma_1(g+1) - \Gamma_1(1)$; by equation (5), $a = m\Psi$, and by hypothesis Ψ is constant. 
Integration is immediate on writing $x = e^{-\tau\Psi/a}$: 

\[ dF = \frac{A}{\tau} \bigl\{ (1 - e^{-\tau\Psi/l_2})^{g} - (1 - e^{-\tau\Psi/l_1})^{g} \bigr\}\, d\tau. \tag{14} \]

Like the Type VI, this distribution fits no better than the parent Yule distribution with 
smaller g (Table 7 D and F). 

Fig. 8. Generation time distribution of Pr. vulgaris with fitted curves. 
(i) Pearson Type VI, (ii) distribution of equation (13). 


As in the Bacillus group, there is in all four species an excess of observed over expected 
frequencies near the origin (Table 10); this is true for all the distributions of Table 7. The 
organisms of very short generation time are well distributed among the several experiments 
carried out on each species; thus the ten Pr. vulgaris organisms of τ < 10 (see Fig. 8) are 
distributed among five experiments; they are not associated with a single experiment of 
exceptionally low mean τ. And it is evident that if the restriction, c₀ = constant, is maintained, 
only extravagant dispersion of the mean could appreciably increase the ordinates 
of the fitted curves near the origin. Qualitatively, this is just such an effect as might result 
from the occurrence of occasional large delays in fission; because of the skewness of the 
distribution the presence of a short-lived daughter organism would be much more obvious 
than that of its long-lived mother. 

On the score of goodness of fit, therefore, Kendall’s hypothesis is to be preferred to Rahn’s, 
but the proposed frequency functions do not represent the observations altogether satisfactorily. 
The large scatter of within-family variance (Table 2) is admissible only on Rahn’s 
hypothesis, but it is associated with the presence of an excess of organisms of very short 
generation time, and cannot be solely accounted for by the long upper tail of the Yule 
distribution. 


Table 10. Observed and expected frequencies of short generation times 
(The range of τ is, for uniformity, taken to be half the distance of the mode from the origin.) 

                                          Bact.        Bact. coli     Strep.      Pr.
                                          aerogenes    anaerogenes    faecalis    vulgaris

Range of τ                                0–9          0–9            0–11        0–11
Observed frequency                        5            11             1           10
Expected, Type III fitted by moments      1.7          6.8            0.6         3.4
Expected, Yule fitted by moments          0.04         1.6            0.02        0.4
Expected, Type VI                         1.4          5.9            0.5         2.2
Expected, Yule-hyperbolic (eq. (14))      0.03         1.2            0.02        0.3


CONCLUSIONS 


The values of the corrected coefficients of variation in Table 9, and the corresponding 
estimates of the ‘genetic’ parameter g constitute the principal quantitative outcome of 
this study. Other features of the pattern of generation times, though less precisely expres- 
sible, are not less important. 

Kendall’s g is nowhere as much as 20, and obviously cannot be identified with the number 
of genes in the organism. Kendall himself (1952a) does not in fact insist on any particular 
interpretation. Rahn’s g ranges up to about 100, which is much larger than the estimate of 
Finney & Martin (25 for Bact. aerogenes); however, private opinion among geneticists 
appears to indicate that this number is still absurdly low, and their view certainly seems 
justified by the wide range of genetically determined properties already known in Bact. coli 
and Neurospora crassa for example. 

The observations on B. mycoides and B. subtilis, though possessing a specialized interest 
of their own, are not at present susceptible of any simple interpretation, however tentative. 

It is very likely that a mechanism of Kendall’s type, i.e. a stepwise process, does occur 
during the fission of an organism, and there is no a priori reason why it should not also be 
preceded or accompanied by Rahn’s gene-duplication process. Then the one will be mani- 
fested clearly only if it is slow enough, relative to the other, to be the dominant factor in 
determining generation time. If neither is dominant (and this is a real possibility) the 
experimental picture will be confused, though it may not be recognized to be so. As things 
stand, Kendall’s hypothesis is to be preferred, but only on the ground that its formal 
expression as a Pearson Type III distribution is in fair accord with the data. Kendall and 
Waugh (Kendall, 1952a) have also proposed a modification of Kendall’s original hypothesis 
by relaxing the condition that the duration of the primitive steps of his fission process shall 
have the same frequency function for all. The modified hypothesis subsumes Rahn’s as 
a special case (though only formally, not structurally). I have not examined its practical 
consequences in detail, but it has one significant feature pointed out by Kendall: suppose 
g is estimated by fitting a Type III distribution to a set of data, then if the primitive steps 
are not all distributed alike, their number will be greater than g. 

It would obviously be possible to test Rahn’s hypothesis by measuring generation times 
of the same species at different temperatures and when growing on different media; the 
coefficient of variation should remain constant. Under Kendall’s more reserved hypothesis, 
change of temperature would be expected to make no difference, but his g might well depend 
on the number of synthetic processes demanded by the laying down of nuclear matter or 
cell wall, for instance, and so might depend in turn on the chemical complexity of the 
nutrients offered. Further, a similar study of a wide range of organisms, including auto- 
trophs and the most exigent pathogens, would show whether or not there is any correlation 
between the dispersion of generation time and richness of capacity and structure. Rahn 
(1932), reaching to the mammals for an extreme comparison, suggests that there is such a 
correlation. 

But I do not think that studies of this kind would be at all profitable at present. There 
are serious objections not only to the acceptance of either hypothesis, but also to the 
acceptance of the data as critical of either: defects in the frequency functions, correlation 
between generation times of sister cells, heterogeneity of variance. The hypotheses, at least 
in their primary intention, relate to nuclear processes whose effect is seen only at two or 
three removes. At every point of difficulty in the foregoing discussions the possibility 
naturally suggests itself that delayed division plays an appreciable part in the dispersion 
of generation time; that the recognizable termination of the cell succeeds the essential 
determinative process by an interval which is itself sensibly dispersed. Therefore, the 
immediate need is for improvement in technique. The use of ultraviolet illumination should 
enable nuclear fission to be seen directly, and so permit the generation time so-called to be 
analysed into its components. Extended study of cultures on cellophane over a flowing 
medium would also be valuable; when constant growth rate can be reliably maintained over 
many generations, it will be possible to calculate the dispersion of generation time simply 
from the dispersion of clone size, with great economy in time and patience (Kendall, 1948). 
Further mathematical work is also in progress (see Kendall, 1952b). 


SUMMARY 

1. The conditions are discussed which must be met in any attempt to measure a frequency 
distribution of generation times of micro-organisms. 

2. Generation times of individual organisms of six species have been measured: Bacterium 
aerogenes, Bact. coli anaerogenes, Streptococcus faecalis, Proteus vulgaris, Bacillus subtilis, 
B. mycoides. 

3. The hypotheses developed by Kendall and by Rahn each connect dispersion of 
generation time with a postulated mechanism of fission; each implies a definite mathematical 
form for the distribution. Comparison with experiment suggests that Kendall’s is to be 
preferred. 

4. In Rahn’s hypothesis, one of the parameters of the distribution is identified with the 
number of genes in the organism. When every allowance is made for experimental error, 
and in the most favourable case (Strep. faecalis), the estimated number of genes is less 
than 100. 

5. Generation times are not distributed at random, but are in part determined by a weak 
and variable hereditary factor effective only over a few generations. 

6. There are positive objections to the acceptance of either Kendall’s or Rahn’s hypothesis: 
correlation between generation times of sister cells, heterogeneity of variance in 
small families of organisms, certain defects in the proposed distribution functions. These 
difficulties, and the effect mentioned under (5), may be due to delayed fission, that is, to the 
lapse of an appreciable period between the division of the nucleus and the observed separa- 
tion of the cell into two parts. 

7. In the Bacillus group, the measurements are not relevant to the hypotheses, because 
the organisms are multicellular. 

8. Improved techniques are required for the further pursuit of the subject. 


T. W. Pearce and Jean M. Scott have shared with me the labour of observation. Much 
of the arithmetic was carried out by R. Ash. I am indebted to Dr D. W. Henderson for his 
confidence and encouragement and to S. Peto for his helpful criticism. In the analysis of 
data I have benefited greatly from Prof. E. S. Pearson’s guidance. Publication is by 
permission of the Chief Scientist, Ministry of Supply. 


QUANTUM HYPOTHESES 


By S. R. BROADBENT 
The British Coal Utilisation Research Association 


1. INTRODUCTION 


(1.1) Statistical problems which involve a mixed population are complex, even when the 
component subpopulations can be assumed to be normal. Karl Pearson (1894) noted in his 
dissection of a frequency curve into two normal components, ‘It may happen that we have 
a mixture of 2, 3, ..., n homogeneous groups, each of which deviates about its own mean 
symmetrically, and in a manner represented with sufficient accuracy by the normal curve.... 
The equations for the dissection of a frequency curve into n normal curves can be written 
down as for the special case of n = 2 treated in this paper; they require us only to calculate 
higher moments. But the analytical difficulties, even for the case of n = 2, are so formidable, 
that it may be questioned whether the general theory could ever be applied in practice to 
any numerical case.’ 

The present paper discusses the simpler problem in which the means of the components 
are equally spaced. The hypothesis that this is the case has been called by Hammersley 
& Morton (1954) a quantum hypothesis, for the means are then a constant plus multiples of 
a basic quantity or quantum. 

Examples of such distributions will be found in the paper and in the references cited. 
It is worth noting that such rules as Brook’s law (1886), formulated by Fowler (1909) in 
his study of the Ostracoda, that ‘during early growth, each stage increases at each moult 
by a fixed percentage of its length, which is approximately constant for its species and sex’, 
and Przibram’s rule (1912) discussed by Wigglesworth (1942), that ‘the weight doubles 
at each instar, and at each moult all linear dimensions are multiplied by ∛2’, take the form 
of quantum hypotheses when the logarithms of the weights or lengths are taken. Therefore 
situations in which a quantum hypothesis applies after the variate is transformed are here 
considered. 


(1.2) Data suggest a quantum hypothesis by the occurrence of regularly spaced modes. 
From such data three results are commonly required: 

(i) an estimate of the quantum which determines the spacing of the modes, 
(ii) an estimate of the scatter within the subdistributions, and 

(iii) a demonstration that the quantum rule is not disobeyed, i.e. that the data are not 
more likely to come from some other distribution, perhaps unimodal. 

When the valleys between the modes are very well marked, there is little difficulty in 
meeting these requirements. Controversy about the interpretation of some data shows that 
a statistical treatment is required for those cases in which grouping at the modes is less 
obvious. 

In the most difficult situation we are presented with data alleged to support a quantum 
hypothesis. No independent information is available; the hypothesis has arisen from a study 
of the data and the means are estimated from the observations. We are required to test 
whether the observations are genuinely grouped about these means. For examples of this 
type, see Hammersley & Morton’s (1954) Druid Circle problem and Grant’s (1952) measurements 
of the energy levels of atomic nuclei. Hammersley & Morton conjecture that in this 
form the problem is beyond present-day analytic resolution, and offer a Monte Carlo method 
of testing the supposed grouping. In this case the difficulty is that the data which prompted 
the hypothesis are also used for estimation and for testing, and it is not clear what allow- 
ance can be made in a test for such previous use. The warning given by Pearson & Chandra- 
sekar (1936) is relevant: ‘To base the choice of the test of a statistical hypothesis upon an 
inspection of the observations is a dangerous practice; a study of the configuration of the 
sample is almost certain to reveal some feature, or features, which are exceptional if the 
hypothesis is true....By choosing the feature most unfavourable to the hypothesis out of 
a very large number of features examined, it will usually be possible to find some reason 
for rejecting the hypothesis.’ We need add only that it will be equally possible to choose 
some untrue hypothesis and to find some test which favours it. 
Fig. 1. Frequency curve on the quantum hypothesis.


(1-3) The population we shall consider will be compounded of normal subdistributions
with means equally spaced at $\beta + 2r\delta$ ($r = 0, 1, \ldots, m$). The probability that a randomly
chosen member of the population is from the $t$th subdistribution, i.e. has expectation
$\beta + 2t\delta$, is $p_t$; not all the $p_t$ are supposed non-zero, but $\sum_{t=0}^{m} p_t = 1$. The frequency
function of such a population is shown in Fig. 1. The subdistributions may be homoscedastic, with
common variance $\sigma^2$, or the S.D. may increase linearly with the mean.


(1-4) The frequency histogram of $n$ observations from such a population will usually
consist of regularly spaced peaks if $\sigma/\delta$ is small and the $p_t$ are not too different. As $\sigma/\delta$
increases the valleys between the peaks disappear and it becomes difficult to assign an
observation to its correct subdistribution. In these circumstances the $p_t$ determine the
overall appearance of the population, i.e. whether it is unimodal, symmetrical and so on.
It is clear that using this model and by suitable choice of the parameters we can approximate
with any required precision to any distribution, and that the graduation between multimodal
and unimodal distributions is continuous.

Just as the genuine quantum model merges into the unimodal model, so it is possible to
find a quantum which fits data from any distribution, since experimental data are usually
rational. We have only to take the H.C.F. of the observations, after adding an arbitrary
constant, to obtain a quantum which fits the data precisely. Other constants are possible
which fit the data less exactly, and even if the observations were not rational, a quantum
can be found to fit the data as closely as we please. Equally, when a quantum fits the data
we have only Occam’s principle to exclude the possibility that the true quantum is a half
or a third of the one we have accepted. The difficulties of estimation in a similar situation
have been discussed by Hammersley (1950).
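
The H.C.F. remark can be illustrated with a small sketch in Python. The function name `exact_quantum` and the three data values are hypothetical, chosen only to show that any rational data set admits an exactly fitting quantum and that the added constant changes which quantum is found.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def exact_quantum(observations, constant=Fraction(0)):
    """Largest step q such that every (observation + constant) is an integer multiple of q."""
    shifted = [Fraction(x) + constant for x in observations]
    # Put all values over a common denominator, then take the H.C.F. of the numerators.
    common_den = reduce(lambda a, b: a * b // gcd(a, b),
                        (v.denominator for v in shifted), 1)
    numerators = [abs(int(v * common_den)) for v in shifted]
    return Fraction(reduce(gcd, numerators), common_den)

# Hypothetical rational observations (not data from the paper).
data = [Fraction("1.2"), Fraction("1.8"), Fraction("3.0")]
print(exact_quantum(data))                   # quantum with no constant added
print(exact_quantum(data, Fraction("0.3")))  # another constant, another "exact" quantum
```

Both quanta fit these data without error, which is why an exact fit carries no evidential weight by itself.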

Obviously when so many alternatives are so easily produced we must be careful to state 
explicitly what assumptions are being made and where independent evidence is being used. 


(1-5) In this paper the problems considered will be limited to the following:

(i) Estimation of the positions of the modes when the observations have been allotted 
to the correct subdistributions, and these subdistributions are normal. 

(ii) Estimation of the variance of each component when the subdistributions are normal 
and either the observations have been allotted to the correct subdistributions (positions 
of the modes unknown) or the positions of the modes are known (observations not allotted 
to subdistributions). 

(iii) Testing whether to accept a hypothesis which specifies the positions of the modes
and which is independent of the data used in the test. The normality of the subdistributions
is not here assumed, the $p_t$ are not specified, nor are the observations allotted to
subdistributions. The hypothesis against which the quantum hypothesis is compared is one of
a class described in § (4-2).


2. ESTIMATION: MODES 


(2-1) In this section it is assumed that the data can be allotted to the correct subdistribu- 
tions, i.e. that when we are given any observation we can say it comes from the tth sub- 
distribution, although we do not know precisely the location of this subdistribution. This 
may be possible because the data fall into clearly defined groups with every component 
known to be represented or for some other reason. For example, in the measurement of 
insects during moulting, each group or subdistribution is composed of measurements 
after a defined number of moults. 


Let $y_{rs}$ be the $s$th observation in the $r$th subdivision, with mean $\beta + 2r\delta$. Then
$$y_{rs} = \beta + 2r\delta + \epsilon_{rs},$$
where $\beta$ and $\delta$ are unknown constants ($2\delta$ is the quantum), $r$ is zero or a positive integer, and
$\epsilon_{rs}$ the normal error, or deviation of $y_{rs}$ from its mean, with mean zero. Let $r = 0, 1, \ldots, m$;
$$s = 1, 2, \ldots, n_r; \qquad \sum_{r=0}^{m} n_r = n; \qquad \sum_{s=1}^{n_r} y_{rs} = Y_r.$$


(2-2) The method that has frequently been used to detect the quantum situation and to
estimate the positions of the modes is to plot the means of the successive groups (or the
experimental modes; these may also be used when observations cannot be allotted to groups)
against $r$, and to draw a straight line through the points so obtained. A similar but efficient
estimation of $\beta$ and $2\delta$ is by the regression of $y_{rs}$ on $r$. The method is equivalent to
least-squares and to maximum-likelihood procedures for estimating the means of the
subdistributions subject to the restriction that they are in arithmetic progression. The results of
the regression analysis are summarized in the three sections below.


(2-3) $\epsilon_{rs}$ has S.D. $\sigma$ (all $r$). The equations giving estimates $b$ and $2d$ for $\beta$ and $2\delta$ are
$$b = \Bigl(\sum r^2 n_r \sum Y_r - \sum r n_r \sum r Y_r\Bigr)\Big/\Delta,$$
$$2d = \Bigl(n\sum r Y_r - \sum r n_r \sum Y_r\Bigr)\Big/\Delta,$$
where $\Delta = n\sum r^2 n_r - (\sum r n_r)^2$, and all summations are over $r = 0, 1, \ldots, m$.
The properties deducible from regression coefficients apply to $b$ (the intercept of the
regression line with the ordinate) and $2d$ (the slope). We have:

(i) the variance $\sigma^2$ of each $y_{rs}$ about its mean is estimated by $s^2 = S^2/(n-2)$, where
$$S^2 = \Bigl\{\Delta\Delta' - \Bigl(n\sum r Y_r - \sum r n_r \sum Y_r\Bigr)^2\Bigr\}\Big/(n\Delta),$$
and $\Delta' = n\sum_r\sum_s y_{rs}^2 - (\sum Y_r)^2$.

(ii) $S^2$ has $\sigma^2$ times the $\chi^2$ distribution with $(n-2)$ d.f.

(iii) the variance of $2d$, estimator of the quantum $2\delta$, is itself estimated by $ns^2/\Delta$.

(iv) $ns^2/\Delta$ is independent of $2d$, so the $t$-test may be applied to hypotheses about $2\delta$.

(v) The variance of $b$, estimator of $\beta$, is itself estimated (not independently of the variance
of $2d$) by $s^2\{1 + (\sum r n_r)^2/\Delta\}/n$.
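
The regression estimates of § (2-3) can be sketched in Python as follows. The function name `quantum_regression` and the small data set are hypothetical. The slope variance is computed as $ns^2/\Delta$, which is the usual $s^2/\sum n_r(r-\bar r)^2$ rewritten with $\Delta = n\sum r^2 n_r - (\sum r n_r)^2$.

```python
def quantum_regression(groups):
    """groups maps r -> list of observations y_rs.

    Returns (b, two_d, s2, var_two_d) from the raw sums of section (2-3).
    """
    n = sum(len(ys) for ys in groups.values())
    sum_rn = sum(r * len(ys) for r, ys in groups.items())
    sum_r2n = sum(r * r * len(ys) for r, ys in groups.items())
    sum_Y = sum(sum(ys) for ys in groups.values())
    sum_rY = sum(r * sum(ys) for r, ys in groups.items())
    sum_y2 = sum(y * y for ys in groups.values() for y in ys)

    delta = n * sum_r2n - sum_rn ** 2
    slope_num = n * sum_rY - sum_rn * sum_Y
    b = (sum_r2n * sum_Y - sum_rn * sum_rY) / delta      # intercept, estimates beta
    two_d = slope_num / delta                            # slope, estimates the quantum
    delta_prime = n * sum_y2 - sum_Y ** 2
    S2 = (delta * delta_prime - slope_num ** 2) / (n * delta)  # residual sum of squares
    s2 = S2 / (n - 2)
    return b, two_d, s2, n * s2 / delta

# Hypothetical observations already allotted to subdistributions r = 0, 1, 2.
example = {0: [1.0, 1.2], 1: [3.1, 2.9], 2: [5.0, 5.2]}
b, two_d, s2, var_two_d = quantum_regression(example)
print(b, two_d, s2, var_two_d)
```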


(2-4) $\beta = 0$ given; $\epsilon_{rs}$ has S.D. $\sigma$ (all $r$). The estimator for $2\delta$ is
$$2d = \Bigl(\sum_r r Y_r\Bigr)\Big/\Bigl(\sum_r r^2 n_r\Bigr).$$

(i) the variance $\sigma^2$ of each $y_{rs}$ about its mean is now estimated by $s_1^2 = S_1^2/(n-1)$, where
$$S_1^2 = \sum_r\sum_s y_{rs}^2 - \Bigl(\sum_r r Y_r\Bigr)^2\Big/\sum_r r^2 n_r.$$

(ii) $S_1^2$ has $\sigma^2$ times the $\chi^2$ distribution with $(n-1)$ d.f.

(iii) the variance of $2d$ is itself estimated by $s_1^2/\sum_r r^2 n_r$.

(iv) $s_1^2/\sum_r r^2 n_r$ is independent of $2d$, so the $t$-test is applicable to hypotheses about $2\delta$.


(2-5) $\beta = 0$ given; $\epsilon_{rs}$ has S.D. $r\sigma$. This is the case in which the S.D. of each observation
varies directly with its expected value, i.e. with $2r\delta$ and so with $r$. In the least-squares
equation for estimating $2\delta$ each observation must be weighted inversely by its S.D.; we then
obtain the maximum-likelihood estimate
$$2d = \frac{1}{n}\sum_r\sum_s \frac{y_{rs}}{r}.$$
This is the arithmetic mean of the homoscedastic ratios $y_{rs}/r$, i.e. in the regression model
each observed point gives an equally precise estimate of the slope.
The variance $\sigma^2$ is estimated by essentially the same method as that implied in (2-3)
and (2-4). Consider the identity
$$(y_{rs} - 2r\delta)/r = (y_{rs} - 2rd)/r + 2(d - \delta).$$
Of these quantities the first is distributed normally with mean zero and variance $\sigma^2$, the
second does not need $\delta$ in its calculation, while the third has mean zero and variance $\sigma^2/n$.
It follows that
$$\sum_r\sum_s (y_{rs} - 2r\delta)^2/r^2 = \sum_r\sum_s (y_{rs} - 2rd)^2/r^2 + 4n(d - \delta)^2,$$
and by Cochran’s theorem that
$$S_2^2 = \sum_r\sum_s (y_{rs} - 2rd)^2/r^2$$
has $\sigma^2$ times the $\chi^2$ distribution with $(n-1)$ d.f.

(i) The variance $\sigma^2$ is estimated by $s_2^2 = S_2^2/(n-1)$, where $S_2^2$ is calculated from
$$S_2^2 = \sum_r\sum_s (y_{rs}/r)^2 - \Bigl\{\sum_r\sum_s (y_{rs}/r)\Bigr\}^2\Big/n.$$
This is otherwise clear, since $y_{rs}/r$ has variance $\sigma^2$.

(ii) The variance of $2d$, estimator of $2\delta$, is itself estimated by $s_2^2/n$.

(iii) $s_2^2$ is independent of $2d$, so the $t$-test may be applied to hypotheses about $2\delta$.

(iv) If we wish to set limits to a single new observation in the $r$th group, we must take
into account the uncertainty of the supposed mean $2rd$, which contributes a variance
$r^2\sigma^2/n$, and the variance $r^2\sigma^2$ of a single observation in this group. The limits are
$$r\bigl\{2d \pm t_{n-1}\, s_2 \sqrt{(n+1)/n}\bigr\},$$
where $t_{n-1}$ denotes the value of Student’s $t$ with $(n-1)$ d.f. at the appropriate probability
level.
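
The estimator of § (2-5) is the mean of the ratios $y_{rs}/r$, and its variance estimate is the sample variance of those ratios divided by $n$. A minimal sketch follows; the function names and the three data pairs are hypothetical, and the $t$ value has to be supplied by the user from tables.

```python
from math import sqrt

def ratio_quantum_estimate(pairs):
    """pairs: iterable of (r, y) with r >= 1. Returns (two_d, s2_sq, var_two_d)."""
    ratios = [y / r for r, y in pairs]
    n = len(ratios)
    two_d = sum(ratios) / n                                    # mean of y_rs / r
    S2_sq = sum(q * q for q in ratios) - sum(ratios) ** 2 / n  # sum of squares about the mean
    s2_sq = S2_sq / (n - 1)
    return two_d, s2_sq, s2_sq / n

def new_observation_limits(r, two_d, s2_sq, n, t_value):
    """Limits for one new observation in group r: r{2d +/- t s_2 sqrt((n+1)/n)}."""
    half_width = t_value * sqrt(s2_sq * (n + 1) / n)
    return r * (two_d - half_width), r * (two_d + half_width)

# Hypothetical data: (multiplier r, observation y).
two_d, s2_sq, var_two_d = ratio_quantum_estimate([(1, 2.0), (2, 4.4), (3, 5.7)])
print(two_d, s2_sq, var_two_d)
```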


(2-6) Example of estimation. Svedberg (1939) gives the molecular weights of fifty-six 
proteins determined by sedimentation velocity or by sedimentation equilibrium. For 
twenty proteins both methods have been used so that seventy-six measurements in all are 
given. Svedberg deduces a ‘law of simple multiples....If we choose 17,600 as the unit the 
majority of the proteins may be divided into eleven classes with molecular weights which 
are multiples of this unit by factors containing powers of 2 and 3. The rule is only approxi- 
mate, indicating that the underlying principle is obscured by some secondary factor.’ 
He notes seventy observations in eleven classes which he considers obey this law, and he 
gives the factor relevant to each class by which the unit is multiplied. 

We take these seventy observations as a random sample from a population of the type
considered in § (2-5), and for each class take the factor given by Svedberg as $r$. It is clear
from a study of the data that it is appropriate to suppose the S.D. of each class is linearly
related to its mean. We can then apply the method of § (2-5) to estimate the quantum for
which Svedberg gives the value 17,600. We obtain $2d = 17{,}920$, $s_2^2 = 3.83$ (69 d.f.), i.e.
95 % limits to $2\delta$ are 17,460 to 18,390.
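
The quoted limits can be checked with a few lines of arithmetic. This is only a sketch under two assumptions that are not stated in the text: that the printed value 3.83 for the variance estimate is in units of $10^6$ (otherwise it cannot be reconciled with a quantum near 17,920), and that the two-sided 5 % point of $t$ with 69 d.f. is about 1.995.

```python
from math import sqrt

two_d = 17920.0   # estimate of the quantum quoted in the text
s2_sq = 3.83e6    # ASSUMPTION: printed 3.83 read in units of 10^6
n = 70            # seventy classified observations
t_69 = 1.995      # ASSUMPTION: approximate two-sided 5 % point of t, 69 d.f.

# Variance of 2d is estimated by s_2^2 / n, as in (2-5)(ii).
half_width = t_69 * sqrt(s2_sq / n)
lower, upper = two_d - half_width, two_d + half_width
print(round(lower), round(upper))
```

Under these assumptions the limits agree with the quoted 17,460 and 18,390 to within rounding of the inputs.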

Six observations are so far from Svedberg’s supposed modes that he did not classify them.
When the limits within which a new observation may be expected to lie are calculated it is
found that two of the outliers are within 95 % limits, two within 98 % limits, and two within
99 % limits. Therefore these extreme observations are not inconsistent with the remainder
of the data.

It may be remarked here that Johnston, Longuet-Higgins & Ogston (1945) consider that
Svedberg’s data do not support his hypothesis. Their tests will be discussed below in § (4-6).


3. ESTIMATION: VARIANCE 


(3-1) In §§ (2-3), (2-4) and (2-5) an estimator for $\sigma^2$ and the form of its distribution were
obtained when the positions of the modes were estimated.


(3-2) Now suppose the positions of the modes are known, independently of the data,
but that the observations have not been allotted to subdistributions and may overlap,
i.e. lie nearer to another mean than their own. The weights $p_t$ are not known. The distribution
is that of § (2-1).

Let $z_i$ ($i = 1, 2, \ldots, n$) be the distance of $y_{rs}$ from the nearest mode, i.e.
$$z_i = y_{rs} - (\beta + 2r'\delta),$$
where $r'$ is chosen to minimize $|z_i|$. Clearly $|z_i| \le \delta$, and $z_i$ is $\epsilon_{rs}$ plus or minus an integer
(or zero) multiple of $2\delta$. For those $\epsilon_{rs}$ satisfying $|\epsilon_{rs}| < \delta$, the distribution of $z_i$ is the same as
that of $\epsilon_{rs}$, i.e. normal with mean zero and variance $\sigma^2$ truncated at $\pm\delta$. For those $\epsilon_{rs}$
satisfying $\delta < \epsilon_{rs} < 3\delta$, the distribution of $z_i$ is the same as that of $(\epsilon_{rs} - 2\delta)$ truncated at $\pm\delta$,
and so on. The effect of transforming from $y$ to $z$ is to cut the overall frequency distribution
of $y$ at $\ldots, \beta - \delta, \beta + \delta, \beta + 3\delta, \ldots$ and to lump together the truncated portions. Since each
subdistribution is the same, save for its mean and $p_t$, the result is shown in Fig. 2, where the
subdistribution (dotted line) is transformed into the distribution of $z$ (full line) by successively
‘turning in’ the tails of the distribution. It will be noticed that the $p_t$ do not appear in the
distribution of $z$, nor is the allocation of $y_{rs}$ to its correct subdistribution relevant to $z_i$.


Fig. 2. Lumping the normal distribution. (Curves shown: distribution of $z$; distribution of $\epsilon$.)


The lumped variance of the observations, $s^2$, is defined by
$$s^2 = \sum_{i=1}^{n} z_i^2/n.$$
We now consider the distribution of $s^2/\delta^2$ on the quantum hypothesis when the
subdistributions are normal. By the central limit theorem $s^2/\delta^2$ is asymptotically normal, and
we therefore derive its first two moments. The mean of $s^2/\delta^2$ is the same as the mean of
$z^2/\delta^2$, and the variance of $s^2/\delta^2$ is $1/n$ that of $z^2/\delta^2$.
Now each $z^2$ is equal to $\epsilon^2$ for $|\epsilon| < \delta$, and for all other $\epsilon$ these values are repeated with
period $2\delta$. We may represent $z^2$ by a Fourier cosine series:
$$z^2 = \frac{\delta^2}{3} + \frac{4\delta^2}{\pi^2}\sum_{r=1}^{\infty}\frac{(-1)^r}{r^2}\cos\Bigl(\frac{r\pi\epsilon}{\delta}\Bigr).$$
When calculating $E[z^2/\delta^2]$ we may integrate the series term-by-term, since it is a uniformly
convergent series of continuous terms for all $z$. We obtain
$$E[z^2/\delta^2] = \frac{1}{3} + \frac{4}{\pi^2}\sum_{r=1}^{\infty}\frac{(-1)^r}{r^2}\exp\Bigl(-\frac{r^2\pi^2\sigma^2}{2\delta^2}\Bigr).$$


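Expanding $z^4$ as a Fourier series we obtain similarly
$$E[z^4/\delta^4] = \frac{1}{5} + \frac{8}{\pi^2}\sum_{r=1}^{\infty}\frac{(-1)^r}{r^2}\exp\Bigl(-\frac{r^2\pi^2\sigma^2}{2\delta^2}\Bigr) - \frac{48}{\pi^4}\sum_{r=1}^{\infty}\frac{(-1)^r}{r^4}\exp\Bigl(-\frac{r^2\pi^2\sigma^2}{2\delta^2}\Bigr),$$
and $\operatorname{Var}[z^2/\delta^2] = E[z^4/\delta^4] - (E[z^2/\delta^2])^2$.
The two infinite series thus introduced are functions of $\tfrac12\pi^2\sigma^2/\delta^2$ and have been tabulated
by Newman (1934) in this form. They are integrals of one of the theta functions. We give
in Table 1 some values of $E[z^2/\delta^2]$ and $\operatorname{Var}[z^2/\delta^2]$ deduced from Newman’s tables.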

Now we write $E[z^2/\delta^2] = g(\sigma^2/\delta^2)$, the function $g$ being tabulated below. It follows that
$s^2/\delta^2$ is an unbiased and consistent estimator of $g(\sigma^2/\delta^2)$:
$$g(\sigma^2/\delta^2) \simeq s^2/\delta^2.$$
Since we require an estimator of $\sigma^2$ explicitly, we rewrite this equation,
$$\sigma^2 \simeq \delta^2 g^{-1}(s^2/\delta^2) = \delta^2 h(s^2/\delta^2).$$
This estimator is consistent and is recommended for general use rather than a maximum-likelihood
estimator (which can be calculated if required) because of its simplicity. A table
of $h(s^2/\delta^2)$ is needed when this estimator is used; it is given as Table 2, and has been obtained
by interpolation in Table 1.
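
The functions $g$ and $h$ can also be evaluated directly rather than from tables. The sketch below sums the Fourier series for the second moment, uses the corresponding series for the fourth moment of $z/\delta$ (derived here from the standard expansion of $x^4$ on a symmetric interval, so treat it as an assumption rather than the paper's own formula), and inverts $g$ by bisection. The function names and the truncation at 200 terms are arbitrary choices.

```python
from math import exp, pi

def _series(x, power, terms=200):
    """Sum over r of (-1)^r exp(-r^2 x) / r^power."""
    return sum((-1) ** r * exp(-r * r * x) / r ** power for r in range(1, terms + 1))

def g(ratio):
    """E[z^2/delta^2] as a function of ratio = sigma^2/delta^2."""
    x = 0.5 * pi ** 2 * ratio
    return 1 / 3 + 4 / pi ** 2 * _series(x, 2)

def var_z2(ratio):
    """Var[z^2/delta^2], using the fourth-moment series (assumed form, see lead-in)."""
    x = 0.5 * pi ** 2 * ratio
    fourth = 1 / 5 + 8 / pi ** 2 * _series(x, 2) - 48 / pi ** 4 * _series(x, 4)
    return fourth - g(ratio) ** 2

def h(value, lo=0.0, hi=5.0, tol=1e-10):
    """Inverse of g by bisection (g is increasing); valid for 0 <= value < 1/3."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < value:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Table 1 is indexed by (1/2) pi^2 sigma^2/delta^2, so convert before calling g.
print(g(2 * 1.0 / pi ** 2), var_z2(2 * 1.0 / pi ** 2), h(0.18))
```

With these definitions the Table 1 row for argument 1.0 and the Table 2 entry for 0.18 are reproduced to the printed accuracy.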


Table 1. Mean and variance of $z^2/\delta^2$ on the quantum hypothesis.
The variance of $s^2/\delta^2$ is $1/n$ that of $z^2/\delta^2$

    (1/2)π²σ²/δ²    E[z²/δ²]    Var[z²/δ²]
        0.0          0.0000       0.0000
        0.1          0.0203       0.0008
        0.2          0.0405       0.0033
        0.3          0.0608       0.0074
        0.4          0.0809       0.0129
        0.5          0.1007       0.0194
        0.6          0.1199       0.0264
        0.7          0.1382       0.0334
        0.8          0.1553       0.0400
        0.9          0.1713       0.0461
        1.0          0.1861       0.0516
        1.5          0.2431       0.0704
        2.0          0.2785       0.0795
        2.5          0.3001       0.0839
        3.0          0.3132       0.0861
        3.5          0.3211       0.0873
        4.0          0.3259       0.0880
        4.5          0.3288       0.0883
        5.0          0.3306       0.0886
        ∞            0.3333       0.0889


4. THE LUMPED VARIANCE TEST


(4-1) Suppose we are in the situation of § (3-2), so that the positions of the suspected
modes have been given independently of the data. The observations are to be used to test
whether there is real grouping about these modes. We neither know the weights $p_t$ to be
attached to each mode nor can we allot any observation to its correct subdistribution. The
alternative to the quantum hypothesis has not been specified, but we wish to interpret
the intuitive feeling that the alternative hypothesis implies no preference for the suspected
modes and is perhaps rectangular or unimodal. We would rather not assume too much about
the form of the subdistributions (e.g. normality), and a test which is in some sense
distribution-free will therefore have advantages.


(4-2) In these circumstances a test using the lumped variance is appropriate. It has
already been noted that neither the $p_t$ nor the allocation of observations to the correct
subdistributions are relevant to the distribution of $s^2$. Further, when the observations are
clustered about the suspected modes $s^2$ is small in relation to $\delta^2$. The tendency of the
observations to be grouped at the modes is therefore measured by $s^2/\delta^2$, and the form of the
subdistributions will be used only in this respect. It is, of course, true that if the true modes
are not those suspected but are near them, and $\sigma/\delta$ is small, then $s^2/\delta^2$ will again be small.

If $y$ has a rectangular distribution, $z$ has also a rectangular distribution between $-\delta$
and $+\delta$. If $y$ has a smooth unimodal distribution whose spread is not small in comparison
with $\delta$, a little thought will show that $z$ has approximately the rectangular distribution.
Therefore, we take the hypothesis that $z$ is rectangularly distributed between $-\delta$ and $+\delta$
to be our null hypothesis. On this rectangular hypothesis $s^2/\delta^2$ is the mean of $n$ independent
variates each of which can be shown to have mean $\tfrac13$ and variance $\tfrac{4}{45}$. These are also the
limits of the mean and variance of $z^2/\delta^2$ on the quantum hypothesis as $\sigma/\delta \to \infty$. By the
central limit theorem, $s^2/\delta^2$ is approximately normally distributed with mean $\tfrac13$ and variance
$4/(45n)$ when $n$ is large; for $n > 20$ the approximation may be expected to be good.


Table 2. $h(s^2/\delta^2)$. An estimate of $\sigma^2$ is given by $\delta^2 h(s^2/\delta^2)$

    s²/δ²    h(s²/δ²)        s²/δ²    h(s²/δ²)
    0.00      0.000          0.16      0.167
    0.01      0.010          0.17      0.180
    0.02      0.020          0.18      0.194
    0.03      0.030          0.19      0.208
    0.04      0.040          0.20      0.223
    0.05      0.050          0.21      0.238
    0.06      0.060          0.22      0.255
    0.07      0.070          0.23      0.274
    0.08      0.080          0.24      0.295
    0.09      0.090          0.25      0.318
    0.10      0.100          0.26      0.344
    0.11      0.111          0.27      0.374
    0.12      0.122          0.28      0.409
    0.13      0.132          0.29      0.452
    0.14      0.143          0.30      0.506
    0.15      0.155          0.31      0.577
                             0.32      0.690


On the quantum hypothesis $s^2/\delta^2$ has expectation less than $\tfrac13$. As $\sigma^2$ increases the
distribution, although grouped, becomes indistinguishable from one of the class of alternatives.


(4-3) The lumped variance test is a one-sided test of the rectangular hypothesis, and
gives the probability that the value of $s^2/\delta^2$ found, or a lower value, would occur by chance
if $z$ were rectangularly distributed. It will be noted that it is the alternative, and not the
quantum hypothesis, which is tested.

The significance points, for probability levels 0.05, 0.01 and 0.001 and $n$ = 20(5)100(50)1000,
are given in Table 3. If the value of $s^2/\delta^2$ found is less than the appropriate value in
this table, the rectangular hypothesis is rejected and the hypothesis of grouping, or quantum
hypothesis, is accepted. It will be remembered that the rectangular hypothesis may be
accepted if the variance $\sigma^2$ of the subdistributions is large in relation to $\delta^2$.
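
The significance points follow from the normal approximation of § (4-2): mean $\tfrac13$ and variance $4/(45n)$ on the rectangular hypothesis. A short sketch, with the hypothetical function name `significance_point`, uses the normal quantile function from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

def significance_point(n, p):
    """Lower-tail point of s^2/delta^2 at level p on the rectangular hypothesis.

    Uses the normal approximation with mean 1/3 and variance 4/(45 n).
    """
    sd = sqrt(4 / (45 * n))
    return 1 / 3 + NormalDist().inv_cdf(p) * sd

for n in (20, 100, 1000):
    print(n, [round(significance_point(n, p), 4) for p in (0.05, 0.01, 0.001)])
```

An observed $s^2/\delta^2$ below the point for the chosen level leads to rejection of the rectangular hypothesis.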

Table 3. Lumped variance test. If $s^2/\delta^2$, calculated from $n$ observations, is less than the value
tabulated at probability level $P$, the quantum hypothesis is accepted at that probability level

       n      P = 0.05    P = 0.01    P = 0.001
      20       0.2237      0.1782      0.1273
      25       0.2353      0.1946      0.1490
      30       0.2438      0.2067      0.1651
      35       0.2504      0.2161      0.1776
      40       0.2558      0.2237      0.1878
      45       0.2602      0.2299      0.1960
      50       0.2640      0.2352      0.2030
      55       0.2672      0.2398      0.2091
      60       0.2700      0.2438      0.2144
      65       0.2725      0.2473      0.2191
      70       0.2747      0.2504      0.2232
      75       0.2767      0.2532      0.2269
      80       0.2785      0.2558      0.2303
      85       0.2801      0.2581      0.2334
      90       0.2816      0.2602      0.2362
      95       0.2830      0.2622      0.2388
     100       0.2843      0.2640      0.2412
     150       0.2933      0.2767      0.2581
     200       0.2987      0.2843      0.2682
     250       0.3023      0.2895      0.2751
     300       0.3050      0.2933      0.2801
     350       0.3071      0.2963      0.2841
     400       0.3088      0.2987      0.2873
     450       0.3102      0.3006      0.2899
     500       0.3114      0.3023      0.2921
     550       0.3124      0.3038      0.2940
     600       0.3133      0.3050      0.2957
     650       0.3141      0.3061      0.2972
     700       0.3148      0.3071      0.2985
     750       0.3154      0.3080      0.2997
     800       0.3160      0.3088      0.3008
     850       0.3165      0.3095      0.3017
     900       0.3170      0.3102      0.3026
     950       0.3174      0.3108      0.3034
    1000       0.3178      0.3114      0.3042


(4-4) Example: distribution of length in Labidocera euchaeta. Seymour Sewell (1912)
measured the total lengths of large numbers of Copepoda with a view to testing whether
they followed Brooks’ law (Fowler, 1909) as stated for Stomatopoda and Ostracoda. From
his data we extract Table 4, of the distribution of length of females of the species Labidocera
euchaeta. Sample A (Seymour Sewell’s fig. 1) consists of 497 specimens from the Rangoon
River Estuary, and sample B (his fig. 2) of 157 collected after sample A from Chittagong.
We shall use these data to answer the question ‘Do the specimens from Chittagong show
any tendency to occur relatively more frequently at the modes from the first sample?’
It is expected from previous work that when length is taken on a logarithmic scale the
data will be grouped about approximately equally spaced modes (any change in the growth
factor is here being neglected). Because the groups overlap and no other information is
available, the positions of the population modes of sample A cannot be estimated by the
methods of § (2). When the experimental modes at 15, 22, 30, 42, 54 and 65 units of length
(the modes at 20, 36 and 49 units are seen from a figure to be spurious) are plotted on a
logarithmic scale against $r = 0, 1, \ldots, 5$ they lie approximately on a straight line. Fitting
a line by eye we conclude that sample A suggests modes at 1.218 (0.126) 1.848 on a
logarithmic scale.

Table 4. Distribution of lengths of females, Labidocera euchaeta
(unit of length 0.04 mm.). Seymour Sewell’s data

    Length  Sample A  Sample B      Length  Sample A  Sample B
      13        2         0           43       15         6
      14        3         0           44        5         3
      15        3         0           45        6         3
      16        3         0           46        9         2
      17        1         0           47       16         2
      18        0         0           48       18         2
      19        1         0           49       23         2
      20        3         0           50       21         8
      21        2         5           51       25        11
      22        4         0           52       25        12
      23        1         0           53       28        10
      24        2         0           54       31         9
      25        4         0           55       19         9
      26        5         1           56        9         7
      27       10         2           57        4         1
      28       10         5           58        2         2
      29       12         4           59        1         0
      30       14         2           60        0         1
      31       12         0           61        1         0
      32        8         0           62        0         1
      33        5         0           63        1         0
      34        1         1           64        7         2
      35        2         1           65       12         2
      36        6         3           66        9         0
      37        5         4           67        7         1
      38        7         7           68        4         0
      39       12         6           69        3         0
      40       18         5           70        1         1
      41       18         6           71        1         1
      42       20         7

    Totals: Sample A 497, Sample B 157

We next require to test whether sample B supports this grouping, and do so by the lumped
variance test. Taking logarithms, we calculate the square of the distance of each observation
of sample B from the nearest mode deduced from sample A. The sum of these squared
distances, divided by $n\delta^2 = 157\,(0.126/2)^2$, is 0.177. Comparison of this value with Table 3
shows that departure from the value expected on the rectangular hypothesis is highly
significant. We conclude that sample B supports strongly the grouping we have deduced
from sample A, i.e. that grouping is real and occurs near the modes given above.

If an estimate of the scatter about these modes is required, and the assumptions of § (3-2)
are justified, the method of that section gives an estimate of the variance about each mode
on a logarithmic scale. Since $s^2/\delta^2$ is 0.177, and $\delta^2$ is $(0.063)^2$, the variance is estimated as
$\delta^2 h(s^2/\delta^2) = (0.063)^2 (0.189) = 0.000750$.
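
The statistic used in this example can be sketched as a short function. The name `lumped_variance_ratio` is hypothetical, and as a simplification the modes are taken to continue indefinitely in both directions at spacing $2\delta$, whereas the paper's modes run over a finite range of $r$. The final lines repeat the variance arithmetic quoted above.

```python
def lumped_variance_ratio(observations, beta, delta):
    """s^2/delta^2 for modes at beta + 2*r*delta (r any integer; a simplification)."""
    total = 0.0
    for y in observations:
        r_nearest = round((y - beta) / (2 * delta))   # index of the nearest mode
        z = y - (beta + 2 * r_nearest * delta)        # signed distance to that mode
        total += z * z
    return total / (len(observations) * delta ** 2)

# Values lying exactly on the modes give 0; values midway between modes give 1.
on_modes = lumped_variance_ratio([1.218, 1.344, 1.470], beta=1.218, delta=0.063)
midway = lumped_variance_ratio([1.281], beta=1.218, delta=0.063)
print(on_modes, midway)

# Variance estimate quoted in the text: delta^2 * h(s^2/delta^2) with h = 0.189.
variance_estimate = 0.063 ** 2 * 0.189
print(variance_estimate)
```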

(4-5) Example: choosing a random point. The following simple experiment is described
because it exemplifies the use of lumped variance. It was designed to test whether the points
on a line chosen ‘at random’ by a number of people were in fact grouped about certain
regular points on the line.

Twenty subjects were chosen from workers at B.C.U.R.A. and each was presented with
a sheet of paper on which were drawn eight parallel lines of length 12 cm. Each subject was
then asked to mark points on the lines at random, i.e. without systematic placing. He was
to make one mark on each of the first two lines, two, three and four on each of the successive
pairs of lines, making twenty marks in all. Alternate subjects marked their lines in the
reverse order.

Fig. 3. Choosing a random point. Control chart of $s^2/\delta^2$ (abscissa: subject number).

The points about which systematic grouping was expected were those which divide the
line into $(n + 1)$ equal segments when $n$ marks were made. Thus when one mark was made it
was expected the end-points and centre would be preferred, when two marks were made
the end-points and thirds, and so on. In this situation the lumped variance test is applicable.
When $n$ points are marked and $l$ is the length of the line, $\delta$ is $l/\{2(n + 1)\}$, and $z$ is the distance
of a mark from the nearest ‘preferred’ point as defined above.

The values for $s^2/\delta^2$ found in the experiment are given in Table 5, and are shown also for
subjects in the form of a control chart. In Table 5, 1a denotes the first line on which one
mark was made, 1b the second line on which one was made, and so on.

The conclusion drawn is that no evidence was found that the subjects did not choose
points on the lines at random; the data are consistent with the hypothesis that all points
on the line are equally likely to be chosen. Clearly more extensive experiments and other
tests may reverse this conclusion. One subject has $s^2/\delta^2$ at about the 99 % point, i.e. his
marks tend to avoid the regular points, and two subjects have $s^2/\delta^2$ below the 3 % point,
i.e. their marks do cluster at these points, but these are the extremes in a sample of twenty.
None of the $s^2/\delta^2$ for lines, nor the general average, is significantly different from $\tfrac13$ as
expected on the rectangular hypothesis. Further, the variance of $s^2/\delta^2$ for subjects is 0.0056;
this does not differ significantly from the variance expected on the rectangular hypothesis,
0.0044.
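
The two summary figures for subjects can be recomputed from the subject columns of Table 5. The sketch below assumes the twenty values as read from that table and uses the sample variance with divisor $n - 1$; the expected variance is $4/(45n)$ with $n = 20$ marks per subject.

```python
from statistics import mean, variance

# s^2/delta^2 for subjects 1 to 20, as read from Table 5.
subject_values = [
    0.371, 0.364, 0.379, 0.366, 0.417, 0.328, 0.336, 0.241, 0.321, 0.311,
    0.199, 0.479, 0.373, 0.414, 0.413, 0.170, 0.286, 0.325, 0.380, 0.359,
]

observed_mean = mean(subject_values)
observed_variance = variance(subject_values)   # divisor n - 1
expected_variance = 4 / (45 * 20)              # rectangular hypothesis, 20 marks each

print(round(observed_mean, 3), round(observed_variance, 4), round(expected_variance, 4))
```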


Table 5. Choosing a random point

    Subject            Subject
    number    s²/δ²    number    s²/δ²       Line     s²/δ²
       1      0.371      11      0.199        1a      0.369
       2      0.364      12      0.479        1b      0.294
       3      0.379      13      0.373
       4      0.366      14      0.414        2a      0.345
       5      0.417      15      0.413        2b      0.316
       6      0.328      16      0.170        3a      0.360
       7      0.336      17      0.286        3b      0.309
       8      0.241      18      0.325
       9      0.321      19      0.380        4a      0.348
      10      0.311      20      0.359        4b      0.364

    Average (subjects) 0.342          Weighted average (lines) 0.342


(4-6) Finally, we consider the statistical tests applied by Johnston et al. (1945) to
Svedberg’s (1939) data. We have already noted, in § (1-2), the difficulty in testing a hypothesis
which is suggested by the data used in the test. Suppose, however, that a lumped variance
test is applied to such data, using the modes suggested by or calculated from the data, and
that the value of $s^2/\delta^2$ found is not significantly less than the value $\tfrac13$ expected on the
rectangular hypothesis. Then we would correctly reject the quantum hypothesis and accept
the class of distribution defined in § (4-2). The reason for doing so is that when the modes
are suggested by the data the value of $s^2/\delta^2$ will tend to be small, corresponding to the
grouping of the data about these modes. In such circumstances it would be wrong to accept
the quantum hypothesis solely because $s^2/\delta^2$ is less than a conventional significance
point, whereas it is proper to reject the quantum hypothesis if $s^2/\delta^2$ does not reach
significance.

The ‘correlation function’ test applied by Johnston et al. is similar to the lumped variance
test proposed in § (4-3); their statistic stands in much the same relation to the lumped
variance as the mean deviation does to the variance. The lumped variance has the advantage
that it is useful in the estimation of variance by the method of § (3-2). Their statistic is, in
the notation defined above,
$$F(n) = 1 - 2\sum_{i=1}^{n}\{|z_i|/(n\delta)\}.$$
It is distributed approximately normally about zero with variance $1/(3n)$ on the rectangular
hypothesis, and a positive value indicates grouping about the supposed modes.

They extend this definition of $F(n)$ to those cases in which the modes are not equally
spaced, taking as the $\delta$ corresponding to each $z_i$ half the distance between the modes on
either side of $z_i$. A similar extension is, of course, possible for the statistic $\sum_{i=1}^{n} z_i^2/(n\delta^2)$ of § (3-2).
The rectangular hypothesis they specify in this case to be a rectangular distribution of $y$
between each mode; this is unnecessarily restricted, for we can consider the class of all
distributions of $y$ which correspond to a rectangular distribution of $z$.

Johnston et al. apply their test to the fifty-six values given by Svedberg, averaging the
results of two determinations when these are available. They test for grouping about the
values $2^n 3^m \times 17{,}600$ for all integral $n$ and $m$, and obtain for $F(n)$ the value 0.137 with
S.D. 0.077; this corresponds to a two-tailed probability of 0.076. Since this does not reach
significance they conclude Svedberg’s hypothesis receives no support from the evidence.
A subdivision of the data is intended to show that only observations near the first two
Svedberg numbers support his hypothesis. Two further tests, applied to part of the data
only, and whose power may be suspected to be low, suggest a similar conclusion.

The probability given by Johnston et al. should properly be one-tailed, since the
alternative hypothesis suggests a positive $F(n)$. Their value therefore exceeds the 5 % significance
level. Moreover, Svedberg proposed eleven groups, the factor multiplying 17,600 being
of the form $2^n 3^m$ for certain values of $n$ and $m$ only. When the ‘correlation function’ test is
applied to the values given by Svedberg, $F(n)$ exceeds the 0.1 % significance level. The
lumped variance test gives the same result, indicating, contrary to the findings of Johnston
et al., that the data do support Svedberg’s hypothesis. For the reasons already given (§ 1-2)
no positive conclusions may be drawn from this.
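
The figures quoted for the ‘correlation function’ test can be retraced from the normal approximation with variance $1/(3n)$. In the sketch below the function name `correlation_function` is hypothetical; the observed value 0.137 and $n = 56$ are those quoted above.

```python
from math import sqrt
from statistics import NormalDist

def correlation_function(z_values, delta):
    """F(n) = 1 - 2 * sum(|z_i|) / (n * delta) for distances z_i to the nearest mode."""
    n = len(z_values)
    return 1 - 2 * sum(abs(z) for z in z_values) / (n * delta)

n, f_observed = 56, 0.137
sd = sqrt(1 / (3 * n))                                  # S.D. on the rectangular hypothesis
one_tailed = 1 - NormalDist().cdf(f_observed / sd)      # alternative: positive F(n)
two_tailed = 2 * one_tailed

print(round(sd, 3), round(two_tailed, 3), round(one_tailed, 3))
```

The one-tailed probability falls below 0.05 while the two-tailed one does not, which is the point made in the text.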


The author is indebted to Prof. G. A. Barnard, Dr J. P. Harding, J. M. Hammersley and 
D. G. Kendall for their advice, and to the Director-General of The British Coal Utilisation 
Research Association for his permission to publish this paper. 
THE TRUNCATED NEGATIVE BINOMIAL DISTRIBUTION


By M. R. SAMPFORD 
Agricultural Research Council Unit of Statistics, University of Aberdeen 


1. INTRODUCTION 


The negative binomial distribution has been discussed by, inter alia, Greenwood & Yule 
(1920), Fisher (1941), Haldane (1941), Anscombe (1950) and Bliss & Fisher (1953), and is 
extensively used for the description of data too heterogeneous to be fitted by a Poisson 
distribution. Observed samples, however, may be truncated, in the sense that the number 
of individuals falling into the zero class cannot be determined. For example, if chromosome 
breaks in irradiated tissue can occur only in those cells which are at a particular stage of 
the mitotic cycle at the time of irradiation, a cell can be demonstrated to have been at that 
stage only if breaks actually occur. Thus in the distribution of breaks per cell, cells not 
susceptible to breakage are indistinguishable from susceptible cells in which no breaks occur. 

Methods for estimation of the parameters of the truncated distribution are considered 
in this paper. The corresponding problem of estimation of the truncated Poisson distribution
has been discussed by David & Johnson (1952), who also discuss the present problem. 


2. THE MOMENTS OF THE TRUNCATED DISTRIBUTION 
The negative binomial distribution has the form (in Fisher’s notation)
$$P(r) = \frac{(k+r-1)!}{(k-1)!\,r!}\,\frac{p^r}{(1+p)^{k+r}} \qquad (r = 0, 1, \ldots;\ p, k > 0), \tag{1}$$
so that $P(0) = 1/(1+p)^k$.

To obtain the corresponding probabilities for the truncated distribution the form (1)
must be divided by $(1 - P(0))$; writing
$$w = 1/(1+p), \qquad \eta = 1 - w,$$
it follows that
$$P(r) = \frac{w^k}{1-w^k}\,\frac{(k+r-1)!}{(k-1)!\,r!}\,\eta^r \qquad (r = 1, 2, \ldots). \tag{2}$$
The factorial moments of this distribution are
$$\mu'_{[j]} = \frac{(k+j-1)!\,\eta^j}{(k-1)!\,w^j(1-w^k)}, \tag{3}$$
whence
$$\mu'_1 = \frac{k\eta}{w(1-w^k)},$$
$$\mu'_2 = \frac{k\eta + k^2\eta^2}{w^2(1-w^k)},$$
$$\mu'_3 = \frac{k\eta + k(3k+1)\eta^2 + k^3\eta^3}{w^3(1-w^k)},$$
$$\mu'_4 = \frac{k\eta(6 - 6w + w^2) + k^2\eta^2(11 - 4w) + 6k^3\eta^3 + k^4\eta^4}{w^4(1-w^k)}.$$
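
The first two moment formulae can be checked numerically by summing the truncated probabilities (2) directly. This is only a sketch: the function names are hypothetical, the factorials are evaluated through `lgamma` so that non-integer $k$ is allowed, and the sum is cut off at an arbitrary large $r$.

```python
from math import exp, lgamma, log

def truncated_nb_pmf(r, k, w):
    """P(r) of equation (2) for r = 1, 2, ..., with eta = 1 - w."""
    eta = 1 - w
    log_coeff = lgamma(k + r) - lgamma(k) - lgamma(r + 1)   # log of (k+r-1)!/((k-1)! r!)
    return exp(log_coeff + k * log(w) + r * log(eta)) / (1 - w ** k)

def formula_moments(k, w):
    """First two moments about the origin from the closed forms."""
    eta = 1 - w
    denom = 1 - w ** k
    return k * eta / (w * denom), (k * eta + k * k * eta * eta) / (w * w * denom)

def summed_moments(k, w, r_max=2000):
    """The same moments by direct summation over r = 1..r_max."""
    probs = [(r, truncated_nb_pmf(r, k, w)) for r in range(1, r_max + 1)]
    return sum(r * p for r, p in probs), sum(r * r * p for r, p in probs)

print(formula_moments(2.0, 0.5), summed_moments(2.0, 0.5))
```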
3. ESTIMATION BY MOMENTS


David & Johnson (1952) suggest that the use of estimates of less than maximum efficiency
is justifiable only if they are directly obtainable as explicit solutions of easily constructed
equations. In discussing the truncated negative binomial, therefore, they do not consider
estimates based on the first two sample moments, which do not provide explicit solutions,
but confine their attention to a method using certain ratios of the first three moments,
i.e. sample estimates of $\mu'_2/\mu'_1$ and $\mu'_3/\mu'_2$. The estimates (only that of $p$ is discussed in detail)
are obtained easily enough, but, in consequence of the introduction of the third sample
moment, are extremely inefficient. (For example, the efficiency for values of the
parameters equivalent to $k = 1$, $w = 0.5$ is as low as 1.7 %.) David & Johnson therefore abandon
completely the use of moments and recommend the maximum-likelihood method for use
in all cases.

Whether, in fact, any particular inefficient procedure is acceptable can only depend on 
the loss of information resulting from its use, the time saved by it, and the relative costs 
of the time or labour spent on observation and on analysis. Thus if an experiment involves 
observations on hundreds of experimental animals, made over a period of several years, 
ten or even a hundred hours of calculation may be dearly saved at the cost of 10% of the 
information so laboriously accumulated. If, on the other hand, the observations are made 
easily and at no great cost, the use of a convenient but statistically ‘inefficient’ method of 
analysis, coupled with an appropriate increase in sample size, may be far more ‘efficient’ 
than a tedious maximum-likelihood calculation, in the sense of giving the same amount of 
information at a lower cost. In this section I give a trial-and-error method for solving the 
moment equations using the first two moments. This method, though not explicit, does not 
take many minutes to carry through. It certainly entails far less labour than the 
maximum-likelihood calculations, and the estimates obtained have a percentage efficiency of 
80 or upwards for all but the most unfavourable combinations of parameters: for k = 1, 
w = 0.5 the efficiency is 77.5% (§6, Table 2), as compared with the 1.7% of the ‘three-moment’ 
estimate. This method can thus be recommended as a reasonable alternative to 
the maximum-likelihood calculations, in circumstances where a method of less than 100% 
efficiency seems likely to prove acceptable. 

Equating the population mean and variance to the corresponding sample values gives 
equations for $k$ and $w$:

\[ \frac{k\eta}{w(1-w^{k})} = m, \qquad \frac{k\eta(1+k\eta)}{w^{2}(1-w^{k})} = m^{2} + s^{2}. \tag{4} \]

These equations can be solved by trial and error, for which purpose they are most 
conveniently expressed in the form

\[ k = \left(m + \frac{s^{2}}{m} - 1\right)\frac{w}{1-w} - 1, \qquad \zeta - 1 \equiv m\,w^{k+1} + \frac{s^{2}}{m}\,w - 1 = 0. \tag{5} \]

The value of $\zeta$ can be evaluated for selected values of $w$, and a solution, if one exists, reached 
by successive linear interpolations. 

To investigate whether a solution exists, and to simplify still further the computations 
required, we consider the function

\[ \phi(x) = \frac{-x\log_e x}{1-x}. \tag{6} \]

This function is tabulated, for values of x between 0 and 1, in Table 4, and can be shown 
to possess the properties
\[ \phi(x) \to 0, 1 \ \text{as}\ x \to 0, 1; \qquad \phi'(x) > 0, \quad \phi''(x) < 0 \quad (0 < x < 1). \]
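Where Table 4 is not to hand, the function (6) is easily evaluated directly. The sketch below is an illustration only; the function names are assumptions, and the expression used for the derivative is the one quoted later in the text, $(\phi - x)/x(1-x)$, with its limiting value ½ at x = 1.

```python
import math

def phi(x):
    """phi(x) = -x log_e(x) / (1 - x) on 0 <= x <= 1, with the limits 0 and 1 at the ends."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    t = 1.0 - x
    # log1p(-t) = log(x); this form stays accurate when x is close to 1
    return -x * math.log1p(-t) / t

def phi_prime(x):
    """Derivative of phi for 0 < x <= 1, written as (phi(x) - x) / (x (1 - x))."""
    if x >= 1.0:
        return 0.5
    return (phi(x) - x) / (x * (1.0 - x))
```

For instance, `phi(0.5)` equals log_e 2, which agrees with the tabulated 0.6931.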


The second of equations (5) then takes the form

\[ \zeta(w) \equiv m\exp\left[-\left(m + \frac{s^{2}}{m} - 1\right)\phi(w)\right] + \frac{s^{2}}{m}\,w = 1. \tag{7} \]

If all the observed values equal 1, $m = 1$ and $s^{2} = 0$, and $\zeta \equiv 1$. Thus in this situation the 
moment estimation procedure, naturally enough, fails. In all further discussion, therefore, 
m will be assumed > 1. 

It is easily verified that
\[ w_0 = 1\Big/\left(m + \frac{s^{2}}{m}\right) \quad (< 1) \]
satisfies equation (7). However, the corresponding value of k, obtained from the first of 
equations (5), is 0. $w_0$, therefore, and any lower values of $w$ (corresponding to negative 
values of k) are inadmissible: we require a solution of (7) in the range
\[ w_0 < w < 1. \]
We have
\[ \zeta(0) = m > 1, \qquad \zeta(w_0) = 1, \qquad \zeta'' > 0 \quad (0 < w < 1). \]
(The result on $\zeta''$ follows from that on $\phi''$ quoted above.) There exists, therefore, at most one 
solution of (7) other than $w_0$ in the range $0 < w < 1$. The condition that there shall be another 
solution less than 1 is
\[ \zeta(1) > 1, \]
the condition that it shall be greater than $w_0$ is
\[ \zeta'(w_0) < 0. \]
These inequalities reduce to limitations on the relative magnitudes of $m$ and $s^{2}$, most 
conveniently expressed in the form

\[ 1 - \exp\left\{-\left(m + \frac{s^{2}}{m} - 1\right)\right\} < \left(m + \frac{s^{2}}{m} - 1\right)\Big/ m < \log_e\left(m + \frac{s^{2}}{m}\right). \tag{8} \]

Neither of these inequalities is necessarily satisfied; it is quite possible to construct samples 
in which they are not. In particular, the lower inequality is not satisfied if $s^{2} = 0$, whatever 
the value of m (reasonably enough; we should hardly expect to get sensible estimates by 
equating an essentially positive function to zero). However, provided very small samples 
are avoided, and provided m is not too near 1, (8) will probably be satisfied. 
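The existence condition (8) can be tested before any attempt is made to solve the moment equations. The following sketch is an illustration under the notation of this section (sample mean m and variance s²); the function name is an assumption, not part of the original paper.

```python
import math

def moment_solution_exists(m, s2):
    """Return the truth of the lower and upper inequalities of (8) separately."""
    a = m + s2 / m - 1.0
    lower_ok = 1.0 - math.exp(-a) < a / m      # solution lies below w = 1
    upper_ok = a / m < math.log(m + s2 / m)    # solution lies above w_0 (k > 0)
    return lower_ok, upper_ok
```

With m = 3.4375 and s² = 9.9315 both inequalities hold; with s² = 0 the lower inequality fails whatever the value of m, as remarked above.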


Equation (7) can be solved either iteratively, writing

\[ w_{i+1} = \frac{m}{s^{2}}\left\{1 - m\exp\left[-\left(m + \frac{s^{2}}{m} - 1\right)\phi(w_i)\right]\right\}, \tag{9} \]

or by trial and error. Inasmuch as the solution must lie on that part of the $\zeta$ curve for which 
$\zeta' > 0$, between the (unknown) minimum of $\zeta$ and the point $w = 1$, it is most convenient to 
take as the first trial value of $w$ either 1, or, if it is less than 1, $w = m/s^{2}$ (clearly $\zeta > 1$ for 
this value), and to work down to the solution. This approach will ensure the convergence of 
the iterative procedure (9), and should minimize the amount of labour required for the 
trial-and-error procedure. In the latter case time will probably be saved by calculating, 
for the first trial value of $w$, the slope

\[ \zeta'(w) = \frac{s^{2}}{m} - \left[\frac{\phi(w) - w}{w(1-w)}\right]\left(m + \frac{s^{2}}{m} - 1\right) m\exp\left[-\left(m + \frac{s^{2}}{m} - 1\right)\phi(w)\right]. \tag{10} \]

Except for the function in square brackets ($= \phi'(w)$) all the terms used in calculating this 
expression will already have been evaluated in the calculation of $\zeta$. This slope can be used 
in choosing the second trial value, after which one or two linear interpolations or 
extrapolations should be sufficient. If $w = 1$ is taken as the initial approximate, $\phi'(w)$ should be 
replaced by its limiting value of ½. 
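Both procedures of this section can be sketched in a few lines. The code below is an illustration only, with assumed function names: it iterates (9) from the recommended starting value, and separately uses the slope (10) to obtain a second trial value from the first. The values m = 3.4375 and s² = 9.9315 are those of the example in § 4, and s² > 0 is assumed throughout.

```python
import math

def phi(x):
    return 1.0 if x >= 1.0 else -x * math.log(x) / (1.0 - x)

def zeta(w, m, s2):
    """Left-hand side of equation (7)."""
    a = m + s2 / m - 1.0
    return m * math.exp(-a * phi(w)) + (s2 / m) * w

def zeta_slope(w, m, s2):
    """Slope (10); the bracketed factor is phi'(w), with limit 1/2 at w = 1."""
    a = m + s2 / m - 1.0
    phi_prime = 0.5 if w >= 1.0 else (phi(w) - w) / (w * (1.0 - w))
    return s2 / m - phi_prime * a * m * math.exp(-a * phi(w))

def solve_by_iteration(m, s2, tol=1e-10, max_iter=10000):
    """Iterate equation (9), starting from min(1, m / s2)."""
    a = m + s2 / m - 1.0
    w = min(1.0, m / s2)
    for _ in range(max_iter):
        w_next = (m / s2) * (1.0 - m * math.exp(-a * phi(w)))
        if abs(w_next - w) < tol:
            return w_next
        w = w_next
    raise RuntimeError("iteration (9) did not converge")

def k_from_w(w, m, s2):
    """First of equations (5)."""
    return (m + s2 / m - 1.0) * w / (1.0 - w) - 1.0

m, s2 = 3.4375, 9.9315
w_first = m / s2
w_second = w_first - (zeta(w_first, m, s2) - 1.0) / zeta_slope(w_first, m, s2)
w_hat = solve_by_iteration(m, s2)
k_hat = k_from_w(w_hat, m, s2)
```

The iteration is run here to a much finer tolerance than is needed in practice, so that small differences from hand calculations based on a four-figure table are to be expected.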





4. EXAMPLE OF ESTIMATION BY MOMENTS 


In an investigation into chromosome breakage, the following sample distribution of breaks 
per cell was obtained:

r = 1(11), 2(6), 3(4), 4(5), 6(1), 8(2), 9, 11, 13.
\[ n = 32, \qquad \sum r = 110, \qquad \sum r^{2} = 686, \]
\[ m = 3.4375, \qquad s^{2} = 9.9315, \qquad s^{2}/m = 2.8892. \]
Equation (7) is thus
\[ 3.4375\,e^{-5.3267\phi} + 2.8892\,w = 1. \]
Taking as the first trial value
\[ w = 1/2.8892 = 0.3461, \]
we obtain from Table 4
\[ \phi = 0.5616, \]
whence $(\phi - w)/w(1-w) = 0.9522$. Then $5.3267\phi = 2.9915$, 
and from tables of the negative exponential function, or of natural logarithms,
\[ e^{-2.9915} = 0.05021, \qquad 3.4375\,e^{-2.9915} = 0.1726, \]
whence
\[ \zeta = 0.1726 + 1.0000 = 1.1726 \]
and
\[ \zeta' = 2.8892 - (0.1726 \times 5.3267 \times 0.9522) = 2.0138, \]
suggesting, as the next trial value,
\[ w = 0.3461 - 0.1726/2.0138 = 0.2604. \]
The remainder of the trial-and-error solution, shown in Table 1(a), leads to the value
\[ w = 0.2346, \]
whence
\[ k = 5.3267 \times 0.2346/0.7654 - 1 = 0.633. \]
Alternatively, by the iterative method, starting from the same initial value, the second 
value is
\[ w_1 = (1 - 0.1726)/2.8892 = 0.2864. \]
The remainder of the calculations, leading after sixteen cycles to the same solution, are 
shown in Table 1(b). 


Table 1. Solution of the moment estimation equation for the example of § 4 

(a) By trial and error 

   w        φ       3.4375 e^(−5.3267φ)      ζ         ζ′
 0.3461   0.5616        0.1726            1.1726    2.0137
 0.2604   0.4737        0.2757            1.0280
 0.2438   0.4550        0.3046            1.0090
 0.2359   0.4459        0.3197            1.0013
 0.2346   0.4444        0.3222            1.0000

(b) By iteration 

   w        φ       3.4375 e^(−5.3267φ)
 0.3461   0.5616        0.1726
 0.2864   0.5018        0.2374
 0.2639   0.4776        0.2700
 0.2527   0.4651        0.2886
 0.2462   0.4578        0.3000
 0.2423   0.4533        0.3073
 0.2397   0.4503        0.3123
 0.2380   0.4484        0.3154
 0.2370   0.4472        0.3175
 0.2362   0.4463        0.3190
 0.2357   0.4457        0.3200
 0.2353   0.4452        0.3209
 0.2350   0.4448        0.3216
 0.2348   0.4446        0.3219
 0.2347   0.4445        0.3221
 0.2346   0.4444        0.3222
 0.2346





(Intermediate stages of the calculations are shown in Table 1 for the sake of the example. 
In practice either calculation can be carried through on a desk computer, recording only the 
successive values of w and, for the trial-and-error process, ζ.) 

Four decimal places have been retained in the estimation of w as a demonstration of 
the degree of precision to which the moment equations can be solved by this method. In 
fact, for this example the variances (§ 5) are so large that there is no practical point in re- 
taining more than two decimal places for w and one for k. 
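The sample quantities used in this example follow directly from the frequency distribution. The short sketch below is an illustration only (the variable names are assumptions); it also makes explicit that s² is here the usual unbiased estimate, with divisor n − 1.

```python
# Observed numbers of cells with r breaks (r = 5, 7, 10, 12 were not observed)
counts = {1: 11, 2: 6, 3: 4, 4: 5, 6: 1, 8: 2, 9: 1, 11: 1, 13: 1}

n = sum(counts.values())
sum_r = sum(r * c for r, c in counts.items())
sum_r2 = sum(r * r * c for r, c in counts.items())

m = sum_r / n                               # sample mean
s2 = (sum_r2 - sum_r ** 2 / n) / (n - 1)    # sample variance, divisor n - 1
ratio = s2 / m
```

The ratio s²/m obtained in this way differs from the quoted 2.8892 only in the last figure, because the text divides the rounded value 9.9315 by m.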


5. THE VARIANCES OF THE MOMENT ESTIMATES 


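The asymptotic variances and covariances of the moment estimates of $w$ and $k$ are given 
by the formulae

\[ J^{2}V(w) = \left(\frac{\partial\mu'_2}{\partial k}\right)^{2}V(m'_1) - 2\,\frac{\partial\mu'_1}{\partial k}\frac{\partial\mu'_2}{\partial k}\,\mathrm{cov}(m'_1, m'_2) + \left(\frac{\partial\mu'_1}{\partial k}\right)^{2}V(m'_2), \]
\[ J^{2}\,\mathrm{cov}(w, k) = -\frac{\partial\mu'_2}{\partial w}\frac{\partial\mu'_2}{\partial k}\,V(m'_1) + \left(\frac{\partial\mu'_1}{\partial w}\frac{\partial\mu'_2}{\partial k} + \frac{\partial\mu'_1}{\partial k}\frac{\partial\mu'_2}{\partial w}\right)\mathrm{cov}(m'_1, m'_2) - \frac{\partial\mu'_1}{\partial w}\frac{\partial\mu'_1}{\partial k}\,V(m'_2), \]
\[ J^{2}V(k) = \left(\frac{\partial\mu'_2}{\partial w}\right)^{2}V(m'_1) - 2\,\frac{\partial\mu'_1}{\partial w}\frac{\partial\mu'_2}{\partial w}\,\mathrm{cov}(m'_1, m'_2) + \left(\frac{\partial\mu'_1}{\partial w}\right)^{2}V(m'_2), \]
where
\[ \frac{\partial\mu'_1}{\partial k} = \frac{\mu'_1}{k}\{1 - \phi(w^{k})\}, \]
\[ \frac{\partial\mu'_1}{\partial w} = -\frac{\mu'_1}{w\eta}\{1 - w^{k+1}\mu'_1\}, \]
\[ \frac{\partial\mu'_2}{\partial k} = \frac{\mu'_2}{k}\{2 - \phi(w^{k})\} - \frac{\mu'_1}{kw}, \]
\[ \frac{\partial\mu'_2}{\partial w} = -\frac{1}{w\eta}\{2\mu'_2 - \mu'_1 - w^{k+1}\mu'_1\mu'_2\}, \]
\[ J = \frac{\partial\mu'_1}{\partial k}\frac{\partial\mu'_2}{\partial w} - \frac{\partial\mu'_1}{\partial w}\frac{\partial\mu'_2}{\partial k}, \]
and
\[ nV(m'_1) = \mu'_2 - \mu'^{2}_1, \qquad n\,\mathrm{cov}(m'_1, m'_2) = \mu'_3 - \mu'_1\mu'_2, \qquad nV(m'_2) = \mu'_4 - \mu'^{2}_2. \]

Inserting into these formulae the moment estimates obtained in the example of the previous 
section,
\[ w = 0.2346, \qquad k = 0.633, \]
we have ($\log_e w = -1.44988$, $w^{k} = 0.3994$):
\[ \mu'_1 = 3.438578, \quad \mu'_2 = 21.758578, \quad \mu'_3 = 215.773952, \quad \mu'_4 = 2941.290671, \]
whence
\[ V(m'_1) = 0.31046123, \qquad \mathrm{cov}(m'_1, m'_2) = 4.4048558, \qquad V(m'_2) = 77.120467. \]
Also
\[ \partial\mu'_1/\partial k = 2.1164, \quad \partial\mu'_1/\partial w = -12.9798, \quad \partial\mu'_2/\partial k = 24.6106, \quad \partial\mu'_2/\partial w = -184.1591, \]
\[ J = -70.31365336, \]
whence
\[ V(w) = 0.015091, \qquad \mathrm{cov}(w, k) = 0.08125, \qquad V(k) = 0.4983. \]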
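The arithmetic of this section can be retraced mechanically. The following sketch is offered as an illustration under the formulae of §§ 2 and 5 as set out above; the function names are assumptions. Because it works to full machine precision rather than with the rounded value $w^{k} = 0.3994$, its results agree with the figures quoted above only to about three significant figures.

```python
import math

def phi(x):
    return -x * math.log(x) / (1.0 - x)

def raw_moments(k, w):
    """First four moments about the origin of the truncated distribution (section 2)."""
    eta, d = 1.0 - w, 1.0 - w ** k
    mu1 = k * eta / (w * d)
    mu2 = (k * eta + k ** 2 * eta ** 2) / (w ** 2 * d)
    mu3 = (k * eta + k * (3 * k + 1) * eta ** 2 + k ** 3 * eta ** 3) / (w ** 3 * d)
    mu4 = (k * eta * (6 - 6 * w + w ** 2) + k ** 2 * eta ** 2 * (11 - 4 * w)
           + 6 * k ** 3 * eta ** 3 + k ** 4 * eta ** 4) / (w ** 4 * d)
    return mu1, mu2, mu3, mu4

def moment_estimate_variances(k, w, n):
    """Asymptotic V(w), cov(w, k), V(k) of the moment estimates."""
    eta = 1.0 - w
    mu1, mu2, mu3, mu4 = raw_moments(k, w)
    v1 = (mu2 - mu1 ** 2) / n          # V(m1')
    c12 = (mu3 - mu1 * mu2) / n        # cov(m1', m2')
    v2 = (mu4 - mu2 ** 2) / n          # V(m2')
    d1k = mu1 / k * (1.0 - phi(w ** k))
    d1w = -mu1 / (w * eta) * (1.0 - w ** (k + 1) * mu1)
    d2k = mu2 / k * (2.0 - phi(w ** k)) - mu1 / (k * w)
    d2w = -(2.0 * mu2 - mu1 - w ** (k + 1) * mu1 * mu2) / (w * eta)
    jac = d1k * d2w - d1w * d2k
    var_w = (d2k ** 2 * v1 - 2 * d1k * d2k * c12 + d1k ** 2 * v2) / jac ** 2
    cov_wk = (-d2w * d2k * v1 + (d1w * d2k + d1k * d2w) * c12 - d1w * d1k * v2) / jac ** 2
    var_k = (d2w ** 2 * v1 - 2 * d1w * d2w * c12 + d1w ** 2 * v2) / jac ** 2
    return var_w, cov_wk, var_k

var_w, cov_wk, var_k = moment_estimate_variances(0.633, 0.2346, 32)
```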


6. THE EFFICIENCY OF THE MOMENT PROCEDURE 


It has been shown by Fisher (1941) that the method of moments may be seriously inefficient 
for the estimation, from complete data, of the parameters of the negative binomial 
distribution. Investigation of the corresponding efficiency for estimation from truncated data 
therefore seems desirable. 


The determinant of the variance-covariance matrix of the moment estimates reduces to

\[ \frac{2(k+1)\,w^{2}(1-w^{k})^{3}\,\{1 - w^{k}[1 + k\eta + \tfrac{1}{2}k(k+1)\eta^{2}]\}}{n^{2}\eta\,\{1 - w^{k}[1 - k^{2}\eta - k(k+1)\log w]\}^{2}}. \]

The determinant of the information matrix for maximum efficiency is

\[ \frac{n^{2}\{1 - w^{k}[1 + k\eta]\}}{w^{2}\eta(1-w^{k})^{3}}\left\{\sum_{r=2}^{\infty}\frac{\eta^{r}(r-1)!\,k!}{r\,(k+r-1)!} - \frac{k\,w^{k}(\eta + \log w)^{2}}{\{1 - w^{k}[1 + k\eta]\}}\right\}, \]

whence the reciprocal of the efficiency of the moment method, given by the product of 
these two expressions, is

\[ \frac{1}{E} = \frac{\{1 - w^{k}[1 + k\eta + \tfrac{1}{2}k(k+1)\eta^{2}]\}\{1 - w^{k}[1 + k\eta]\}}{\{1 - w^{k}[1 - k^{2}\eta - k(k+1)\log w]\}^{2}}\left\{\sum_{r=2}^{\infty}\frac{2\eta^{r-2}(r-1)!\,(k+1)!}{r\,(k+r-1)!} - \frac{2k(k+1)\,w^{k}(\eta + \log w)^{2}}{\eta^{2}[1 - w^{k}(1 + k\eta)]}\right\}. \]


Table 2 shows the percentage efficiency of the moment method for selected values of k, 
and of the mean of the complete distribution. From this it is clear that even for quite 
small means there may be a serious loss of efficiency for low values of k. 

For the example of § 4, k was estimated as 0.633, and the mean of the complete distribution 
$= k\eta/w = 2.065$, from which estimation by moments would appear to be about 70% 
efficient in this case. 
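The efficiency can be evaluated for any k and any mean of the complete distribution from the expression for 1/E. The sketch below is an illustration with an assumed function name; it uses the relation mean $= k\eta/w$, so that $w = k/(k + \text{mean})$, and accumulates the infinite series through the ratio of successive terms, which avoids factorials of non-integer k.

```python
import math

def moment_efficiency(k, mean, tol=1e-15, max_terms=1_000_000):
    """Percentage efficiency of the moment method for given k and complete-distribution mean."""
    w = k / (k + mean)
    eta = 1.0 - w
    wk = w ** k
    log_w = math.log(w)
    a = 1.0 - wk * (1.0 + k * eta + 0.5 * k * (k + 1.0) * eta ** 2)
    b = 1.0 - wk * (1.0 + k * eta)
    c = 1.0 - wk * (1.0 - k ** 2 * eta - k * (k + 1.0) * log_w)
    # Series sum over r >= 2 of 2 eta^(r-2) (r-1)! (k+1)! / (r (k+r-1)!); the r = 2 term is 1.
    term, series, r = 1.0, 0.0, 2
    while term > tol and r < max_terms:
        series += term
        term *= eta * r * r / ((r + 1.0) * (k + r))   # ratio of term r+1 to term r
        r += 1
    inv_e = a * b / c ** 2 * (series - 2.0 * k * (k + 1.0) * wk * (eta + log_w) ** 2 / (eta ** 2 * b))
    return 100.0 / inv_e
```

For k = 1 this gives values close to 84.7 and 77.5 for means of 1 and 2 respectively.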





Table 2. Percentage efficiency of the moment method of estimation, for selected 
values of k and of the mean of the complete distribution 

                          k
 Mean     0.5      1       2       3       4       5
 0.5     82.4    90.6    95.7    97.5    98.3    98.8
 1.0     74.6    84.7    92.4    95.4    96.9    97.8
 2.0     66.1    77.5    87.5    92.0    94.4    95.9
 5.0      …       …      79.2    85.5    89.2    91.6


7. ESTIMATION BY MAXIMUM LIKELIHOOD 


In view of the results of the previous section, it seems desirable to consider the maximum- 
likelihood estimation procedure in some detail. 
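The log likelihood is

\[ \log L = \sum_{r=1}^{R} n_r\log\left\{\frac{w^{k}}{1-w^{k}}\,\frac{(k+r-1)!}{(k-1)!\,r!}\,\eta^{r}\right\} \]
\[ = nk\log w - n\log(1-w^{k}) + \sum_{r=1}^{R} r\,n_r\log(1-w) - \sum_{r=1}^{R} n_r\log r! + \sum_{r=1}^{R} n_r\sum_{j=1}^{r}\log(k+j-1), \]

giving the maximum-likelihood equations

\[ \frac{nk}{w(1-w^{k})} - \frac{nm}{1-w} = 0, \tag{11} \]
\[ \frac{n\log w}{1-w^{k}} + \sum_{r=1}^{R} n_r\sum_{j=1}^{r}\frac{1}{k+j-1} = 0, \tag{12} \]

which, following Haldane (1941), is conveniently rewritten as

\[ \frac{n\log w}{1-w^{k}} + \sum_{j=1}^{R}\frac{1}{k+j-1}\sum_{r=j}^{R} n_r = 0, \tag{13} \]

where R is the highest observed value of r. 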
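The left-hand sides of equations (11) and (13) are simple to evaluate for given trial values. The sketch below is an illustration only, with assumed names; `counts` maps each observed r to its frequency $n_r$, and the inner sum of (13) is accumulated as a running tail total.

```python
import math

def ml_scores(k, w, counts):
    """Left-hand sides of the maximum-likelihood equations (11) and (13)."""
    n = sum(counts.values())
    total = sum(r * c for r, c in counts.items())    # n times the sample mean m
    highest = max(counts)                            # R, the highest observed r
    wk = w ** k
    score_w = n * k / (w * (1.0 - wk)) - total / (1.0 - w)
    tail = n                                         # sum of n_r over r >= j, starting at j = 1
    weighted = 0.0
    for j in range(1, highest + 1):
        weighted += tail / (k + j - 1)
        tail -= counts.get(j, 0)
    score_k = n * math.log(w) / (1.0 - wk) + weighted
    return score_w, score_k

counts = {1: 11, 2: 6, 3: 4, 4: 5, 6: 1, 8: 2, 9: 1, 11: 1, 13: 1}
score_w, score_k = ml_scores(0.493, 0.2113, counts)
```

At values close to the maximum-likelihood solution both quantities should be small; for the data of § 4 and the trial values shown they are of the order of 0.01 or less.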


These equations are easily soluble by the usual maximum-likelihood iterative procedure. 
The components of the information matrix are the expected values of the quantities

\[ -\frac{\partial^{2}\log L}{\partial w^{2}} = \frac{nk[1 - (k+1)w^{k}]}{w^{2}(1-w^{k})^{2}} + \frac{nm}{(1-w)^{2}}, \]
\[ -\frac{\partial^{2}\log L}{\partial w\,\partial k} = -\frac{n\{1 - (1 - k\log w)\,w^{k}\}}{w(1-w^{k})^{2}}, \tag{14} \]
\[ -\frac{\partial^{2}\log L}{\partial k^{2}} = \sum_{j=1}^{R}(k+j-1)^{-2}\sum_{r=j}^{R} n_r - \frac{n\,w^{k}(\log w)^{2}}{(1-w^{k})^{2}}. \]

The iteration, however, is most conveniently carried out using the quantities (14) themselves, 
rather than their expected values. An example of the calculations for this method is given 
at the end of the next section. 


8. AN ALTERNATIVE METHOD FOR THE SOLUTION OF THE 
MAXIMUM-LIKELIHOOD EQUATIONS 


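From equations (11) and (13) it follows that

\[ \phi(\hat{w}) = \frac{\hat{k}}{nm}\sum_{j=1}^{R}(\hat{k}+j-1)^{-1}\sum_{r=j}^{R} n_r, \tag{15} \]

where $\hat{w}$ and $\hat{k}$ are maximum-likelihood estimates, and m is the sample mean. Equation (11) 
can be rewritten

\[ \psi(\hat{w}, \hat{k}) \equiv \frac{\hat{w}(1-\hat{w}^{\hat{k}})\,m}{\hat{k}(1-\hat{w})} = 1. \tag{16} \]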


For a given value of k, equation (15) can be used to evaluate the corresponding $\phi$ and hence, 
from Table 4, $w$; the value of $\psi$ can then be calculated. This form of the equations was 
presented by David & Johnson (1952); they suggested an iterative method, equivalent in 
the present notation to
\[ k_{i+1} = k_i\,\psi(w_i, k_i), \]
and provided a table of values of $p = (1-w)/w$ for values of $\phi$ (in their notation, y) 
0.40 (0.05) 1.00. This process, however, appears to converge rather slowly, and a solution 
can, in many cases, be reached more expeditiously by a trial-and-error process, starting 
with the moment, or some other inefficient, estimate of k. For this value of k the value of $\psi$ 
can be evaluated, together with the slope of the curve ($\psi = \psi(k)$) defined by (15) and (16):

\[ \psi'(k) = \frac{\psi - m\,w^{k+1}}{\phi(w) - w}\cdot\frac{1}{nm}\left[\sum_{j=1}^{R}(k+j-1)^{-1}\sum_{r=j}^{R} n_r - k\sum_{j=1}^{R}(k+j-1)^{-2}\sum_{r=j}^{R} n_r\right] - \frac{\psi}{k}\{1 - \phi(w^{k})\}. \tag{17} \]
These values provide a second approximation to the root, which can then be found by a 
process of successive linear interpolation and extrapolation. Final adjustments may be 
made, if required, by the usual maximum-likelihood iterative procedure: this will usually 
be unnecessary, but the quantities (14) can be calculated, to provide an estimate of the 
asymptotic variance-covariance matrix of the estimates. 
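The trial-and-error scheme of this section can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of the paper: Table 4 is replaced by a numerical inversion of $\phi$ by bisection, and the successive linear interpolations are replaced by bisection on $\psi(k) - 1$ over a bracket supplied by the user. All function names are assumptions.

```python
import math

def phi(x):
    return -x * math.log(x) / (1.0 - x)

def invert_phi(target, iterations=200):
    """Solve phi(w) = target for 0 < w < 1 by bisection (phi is increasing)."""
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if phi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def psi(k, counts):
    """Return psi(k) of (16), with w obtained from (15), together with that w."""
    n = sum(counts.values())
    m = sum(r * c for r, c in counts.items()) / n
    highest = max(counts)
    tails = [sum(c for r, c in counts.items() if r >= j) for j in range(1, highest + 1)]
    s1 = sum(t / (k + j) for j, t in enumerate(tails))   # index j = 0 is the paper's j = 1
    w = invert_phi(k * s1 / (n * m))
    return w * (1.0 - w ** k) * m / (k * (1.0 - w)), w

def solve_k(counts, k_lo, k_hi, tol=1e-9):
    """Bisection on psi(k) - 1; the bracket must contain a sign change."""
    f_lo = psi(k_lo, counts)[0] - 1.0
    f_hi = psi(k_hi, counts)[0] - 1.0
    if f_lo * f_hi > 0:
        raise ValueError("psi - 1 does not change sign on the bracket")
    while k_hi - k_lo > tol:
        mid = 0.5 * (k_lo + k_hi)
        f_mid = psi(mid, counts)[0] - 1.0
        if f_lo * f_mid <= 0:
            k_hi = mid
        else:
            k_lo, f_lo = mid, f_mid
    k = 0.5 * (k_lo + k_hi)
    return k, psi(k, counts)[1]

counts = {1: 11, 2: 6, 3: 4, 4: 5, 6: 1, 8: 2, 9: 1, 11: 1, 13: 1}
psi_start, w_start = psi(0.63, counts)
k_hat, w_hat = solve_k(counts, 0.4, 0.63)
```

For the data of § 4, `psi(0.63, counts)` is close to 0.9922, and the root lies near k = 0.493.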

As with the moment method, extreme samples can occur for which the equations (15) 
and (16) have no solutions with k > 0. Unfortunately, the function $\psi(k)$ is of considerably 
more complicated a form than the function $\zeta(w)$ occurring in the moment method, and the 
existence problem is correspondingly more difficult of solution. It can be shown that

\[ \psi(0) = 1, \qquad \psi'(0) = \frac{1}{n}\sum_{j=2}^{R}(j-1)^{-1}\sum_{r=j}^{R} n_r + \tfrac{1}{2}\log a, \]

where $\phi(a) = 1/m$, and that

\[ \lim_{k\to\infty}\psi(k) = m(1 - e^{-G})/G, \]

where

\[ G = \frac{2}{nm}\sum_{j=1}^{R}(j-1)\sum_{r=j}^{R} n_r. \]

Clearly the conditions

\[ \psi'(0) > 0, \qquad \lim_{k\to\infty}\psi(k) < 1 \]

are sufficient to ensure the existence of at least one solution, and I would conjecture that 
they are necessary, and also sufficient to ensure uniqueness, but have been unable to prove 
these results. 
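The two sufficient conditions are readily checked for a given sample. The sketch below is an illustration only, with assumed function names; the quantity a is obtained by solving $\phi(a) = 1/m$ numerically, and m > 1 is assumed so that G > 0.

```python
import math

def phi(x):
    return -x * math.log(x) / (1.0 - x)

def invert_phi(target, iterations=200):
    """Solve phi(a) = target for 0 < a < 1 by bisection."""
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if phi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ml_existence_quantities(counts):
    """Return psi'(0) and the limit of psi(k) as k tends to infinity."""
    n = sum(counts.values())
    m = sum(r * c for r, c in counts.items()) / n
    highest = max(counts)
    # tails[j - 1] is the sum of n_r over r >= j
    tails = [sum(c for r, c in counts.items() if r >= j) for j in range(1, highest + 1)]
    a = invert_phi(1.0 / m)
    slope_at_zero = sum(tails[j - 1] / (j - 1) for j in range(2, highest + 1)) / n + 0.5 * math.log(a)
    g = 2.0 / (n * m) * sum((j - 1) * tails[j - 1] for j in range(1, highest + 1))
    limit_at_infinity = m * (1.0 - math.exp(-g)) / g
    return slope_at_zero, limit_at_infinity

counts = {1: 11, 2: 6, 3: 4, 4: 5, 6: 1, 8: 2, 9: 1, 11: 1, 13: 1}
slope_at_zero, limit_at_infinity = ml_existence_quantities(counts)
```

For the data of § 4 the slope at the origin is positive and the limit is below 1, so that a solution with k > 0 exists.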

The question remains: What action is to be taken when the maximum-likelihood (or 
moment) method fails to give an acceptable solution? It seems reasonable to hope that one 
or other of the two limiting forms of the negative binomial, the Poisson distribution and 
Fisher’s ‘logarithmic series’ distribution, will provide an adequate fit for all but some 
pathologically extreme (and highly improbable) samples which are unlikely to be fitted 
satisfactorily by any meaningful distribution. 


9. AN EXAMPLE OF THE MAXIMUM-LIKELIHOOD CALCULATIONS 


The data are those already used in § 4; details of the calculations are shown in Table 3. The 
quantity $\sum_{r=j}^{R} n_r$, used in the calculation of $\phi$, is tabulated in the third column of Table 3(b), 
and the remainder of the calculations are shown in some detail in Table 3(a). In fact, this 
table shows far more detail than need be recorded in practice; if the calculations are made on 
a desk machine only the first three columns of Table 3(b) and the successive values of $k$, $w$ 
and $\psi$ need be written down. (The weighted sums of quotients in the second column of 
Table 3(a) can be accumulated on the machine, and they, together with the values of $\phi$, 
$\log w$ and $w^{k}$, are used as soon as they are calculated, and need not be written down.) 
However, $\phi$ and $w^{k}$ should be recorded for the first trial value, for use in evaluating $\psi'$, 
and if final adjustments are to be made by the iterative process, all the quantities tabulated 
in Table 3(a) should be recorded for the last trial value of k. The fourth and fifth columns of 
Table 3(b) are recorded for the purpose of calculating $\sum_{j=1}^{R}(k+j-1)^{-2}\sum_{r=j}^{R} n_r$; the fourth, 
$(0.63 + j - 1)^{2}$, for use in calculating $\psi'$; the fifth, $(0.493 + j - 1)^{2}$, for use in calculating the 
variance-covariance matrix. The only remaining values required in this section are
\[ nm = 110, \qquad m = 3.4375. \]
As an initial trial value the moment estimate of k, 0.633, was rounded to two decimal 
places, giving
\[ \psi(0.63) = 0.9922. \]
As a guide to the location of the second trial value, $\psi'$ was evaluated for k = 0.63. 
Accumulating on the machine,
\[ \sum_{j=1}^{R}(0.63 + j - 1)^{-2}\sum_{r=j}^{R} n_r = 92.2942, \]
whence
\[ \psi' = \frac{(0.9922 - 0.3179)(0.7008 - 0.5286)}{0.2094} - \frac{0.9922 \times 0.3905}{0.63} = -0.0605. \]
The second trial value was therefore taken as
\[ 0.63 - (1 - 0.9922)/0.0605 = 0.50. \]
The fourth value, k = 0.493, gives a value of $\psi$ equal to 1 to four decimal places. There is 
no hope of improving further on this estimate by the trial-and-error method, without 
taking more decimal places, and using a more complicated interpolation formula, in the 
table of $\phi$. There is, in fact, no great need of any further improvement, but, if desired, final 
increments to $k$ and $w$ can be calculated using the variance-covariance matrix, which requires 
calculation in any case. 

Table 3. Maximum-likelihood calculations for the example of § 9 

(a) Trial-and-error calculations 

   k      Σ_j (k+j−1)⁻¹ Σ_{r≥j} n_r      φ         w        log_e w      w^k        ψ          ψ′
 0.63            77.0905              0.4415    0.2321    −1.46059    0.3984    0.9922    −0.0605
 0.50            91.9239              0.4178    0.2124    −1.54930    0.4609    0.99952
 0.491           93.2185              0.4161    0.2110    −1.55590    0.4658    1.00016
 0.493           92.9270              0.4165    0.2113    −1.55450    0.4647    0.99996

(b) Subsidiary tabulations 

                                                               Expected values of n_j
                                                                      Negative binomial
  j   n_j   Σ_{r≥j} n_r   (0.63+j−1)²   (0.493+j−1)²   Truncated    Moments    M.L.     Log
                                                        Poisson                         series
  1   11       32           0.3969        0.243049        4.01       10.31    10.80    13.32
  2    6       21           2.6569        2.229049        6.64        6.44     6.36     5.85
  3    4       15           6.9169        6.215049        7.33        4.33     4.17     3.43
  4    5       11          13.1769       12.201049        6.07        3.01     2.87     2.26
  5    0        6          21.4369       20.187049        4.02        2.13     2.03     1.59
  6    1        6          31.6969       30.173049        2.22        1.53     1.47     1.16
  7    0        5          43.9569       42.159049        1.05        1.11     1.07     0.88
  8    2        5          58.2169       56.145049        0.43        0.81     0.79     0.67
  9    1        3          74.4769       72.131049        0.16        0.60     0.59     0.53
 10    0        2          92.7369       90.117049        0.05        0.44     0.44     0.42
 11    1        2         112.9969      110.103049   }
 12    0        1         135.2569      132.089049   }    0.02        1.29     1.41     1.89
 13    1        1         159.5169      156.075049   }



The fifth column of Table 3(b) shows values of
\[ (0.493 + j - 1)^{2}, \]
and from these values (accumulating quotients on the machine)
\[ \sum_{j=1}^{R}(0.493 + j - 1)^{-2}\sum_{r=j}^{R} n_r = 145.196464173. \]
Then, from this value and the entries in the last line of Table 3(a), the estimate of the 
variance-covariance matrix, calculated from formulae (14), is

\[ [V] = \begin{pmatrix} 554.4186 & -94.69298 \\ -94.69298 & 19.792976 \end{pmatrix}^{-1} = \begin{pmatrix} 0.0098628 & 0.0471853 \\ 0.0471853 & 0.276265 \end{pmatrix}. \]

Inserting the values k = 0.493, w = 0.2113 into the left-hand sides of the 
maximum-likelihood equations (11) and (12), we obtain 0.006 and −0.0003 respectively. Thus the final 
increments are

\[ \begin{pmatrix} \delta w \\ \delta k \end{pmatrix} = [V]\begin{pmatrix} 0.006 \\ -0.0003 \end{pmatrix} = \begin{pmatrix} 0.00005 \\ 0.00020 \end{pmatrix}. \]
Table 4. The function φ(x) = −x log_e x / (1 − x) 

  x     0.00     0.01     0.02     0.03     0.04     0.05     0.06     0.07     0.08     0.09
 0.0   0        0.0465   0.0798   0.1085   0.1341   0.1577   0.1796   0.2002   0.2196   0.2381
 0.1   0.2558   0.2728   0.2891   0.3049   0.3201   0.3348   0.3491   0.3629   0.3764   0.3896
 0.2   0.4024   0.4149   0.4271   0.4390   0.4507   0.4621   0.4733   0.4843   0.4950   0.5056
 0.3   0.5160   0.5262   0.5362   0.5461   0.5558   0.5653   0.5747   0.5839   0.5930   0.6020
 0.4   0.6109   0.6196   0.6282   0.6367   0.6451   0.6533   0.6615   0.6695   0.6775   0.6854
 0.5   0.6931   0.7008   0.7084   0.7159   0.7233   0.7307   0.7380   0.7451   0.7522   0.7593
 0.6   0.7662   0.7731   0.7800   0.7867   0.7934   0.8000   0.8066   0.8131   0.8195   0.8259
 0.7   0.8322   0.8385   0.8447   0.8509   0.8570   0.8630   0.8690   0.8750   0.8809   0.8868
 0.8   0.8926   0.8983   0.9041   0.9097   0.9154   0.9209   0.9265   0.9320   0.9374   0.9429
 0.9   0.9482   0.9536   0.9589   0.9642   0.9694   0.9746   0.9797   0.9848   0.9899   0.9950



The maximum-likelihood estimates are therefore
\[ \hat{w} = 0.2113 \pm 0.0993, \qquad \hat{k} = 0.493 \pm 0.526. \]


The variance-covariance matrix for the moment estimates, recalculated in terms of the 
maximum-likelihood estimates to provide a more valid comparison, is

\[ \begin{pmatrix} 0.01364125 & 0.0693156 \\ 0.0693156 & 0.405868 \end{pmatrix}. \]

The ratio of the determinants of the two matrices, converted to a percentage efficiency, is
\[ 100 \times 0.000498/0.000732 = 68.1\%. \]
The efficiency is much the same for the two parameters, the actual values being 72.3% 
for w, and 68.1% for k. 
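The matrix arithmetic behind these figures is short enough to verify directly. The following sketch is an illustration only (names are assumptions): it inverts the matrix of the quantities (14) evaluated in this section, takes square roots of the diagonal for the standard errors, and forms the efficiencies from the two variance-covariance matrices as printed.

```python
import math

def inverse_and_det(mat):
    """Inverse and determinant of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = mat
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]], det

# Quantities (14) evaluated at k = 0.493, w = 0.2113, in the order (w, k)
observed_information = [[554.4186, -94.69298], [-94.69298, 19.792976]]
v_ml, det_info = inverse_and_det(observed_information)
se_w, se_k = math.sqrt(v_ml[0][0]), math.sqrt(v_ml[1][1])

# Variance-covariance matrix of the moment estimates, as recalculated in the text
v_moments = [[0.01364125, 0.0693156], [0.0693156, 0.405868]]
det_ml = 1.0 / det_info
det_moments = v_moments[0][0] * v_moments[1][1] - v_moments[0][1] ** 2

joint_efficiency = 100.0 * det_ml / det_moments
efficiency_w = 100.0 * v_ml[0][0] / v_moments[0][0]
efficiency_k = 100.0 * v_ml[1][1] / v_moments[1][1]
```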

The maximum-likelihood estimate of k does not differ significantly from 0, which suggests 
that the data might be adequately fitted by Fisher’s logarithmic series distribution. This, 
in fact, proves to be the case. The last four columns of Table 3 (b) show the expected numbers 
of cells with j breaks on the basis of four fitted distributions, a truncated Poisson (fitted by 
maximum likelihood), the two negative binomials fitted above, and the logarithmic series 
distribution (fitted by maximum likelihood). The fit of the Poisson distribution is obviously 
very poor; equally obviously the other three are all very good. The negative binomials 
appear, superficially, to give a slightly better representation of the data, but this impression 
is largely due to the very good agreement in the first group. In fact the justification for 
using the negative binomial rather than the logarithmic series distribution comes, not 
from the data presented here, but from the whole series of experiments, only one of which 
is used here. In this series some distributions could be fitted adequately by the truncated 
Poisson, some by the logarithmic series distribution, and some not very satisfactorily by 
either, but the negative binomial form, with appropriate k, gave a good fit in all cases. 


I am indebted to Dr C. E. Ford, of the Atomic Energy Research Establishment, Harwell, 
for permission to use his data in my examples, and to Miss A. D. Outhwaite, for drawing 
my attention to an omission from § 3. 
THE RANDOMIZATION ANALYSIS OF A GENERALIZED 
RANDOMIZED BLOCK DESIGN† 


By M. B. WILK 
Iowa State College 


1. INTRODUCTION 
(1.1) The experimental situation and design 

Suppose that t treatments are given whose properties (yields, responses, effects, etc.) we 
wish to compare when they interact with a given set of rs experimental units, the latter 
being classified into r blocks, each containing s = pt units. Suppose, further, that an 
experiment is carried out in which the treatments are applied at random to the experi- 
mental units, with the restriction that each treatment appears with p units in each of the 
r blocks. We refer to this design as the generalized randomized block design and note that it 
includes as special cases the completely randomized design (r = 1, p > 1) and the randomized 
block design (r>1, p = 1). 

The object of this paper is to study the basis for statistical inference which is provided 
by the randomization procedure. 


(1.2) Some previous work 

The introduction of the device of randomization in the statistical design of experiments 
is due to R. A. Fisher (1926, 1935). A brief review of some important contributions to the 
problem of inference from randomized experiments is given by Wilk (1953). 

In particular, we note here that the pattern of the present study leans heavily on the finite 
model analyses given by Kempthorne (1952a, b). The means and variances of the analysis 
of variance sums of squares obtained by Welch (1937) and Pitman (1937) for the ran- 
domized block design, and by Kempthorne (1952) for the completely randomized design, 
and the expectations under randomization of the analysis of variance mean squares for 
randomized blocks with non-additivity given by Kempthorne (1952a), derive as special 
cases from the results (Wilk, 1953) on which the present paper is based. 


(1.3) Experimental error and randomization 

In many experimental situations it seems reasonable to distinguish two sources of 
experimental error, namely, the failure of different experimental units treated alike to 
respond identically, and the inability to reproduce an applied treatment exactly. The first 
of these, which we shall refer to as the unit error, stems from variation among the experi- 
mental units. The second type, which we shall call the technical error, stems from limitations 
on experimental technique. 

Generally, the unit error is to be regarded as a fixed quantity associated with any given 
experimental unit; while the technical error may often be idealized as a random variable, 
say following a normal distribution with mean 0. The process of randomization is, usually, 
irrelevant to our conception of the technical errors, but is of critical importance in our 
treatment of the unit errors. It might in fact be said that the main function of randomization 
in experimental design is to control, in a statistical sense, the unit errors. 
To focus attention on the basis for statistical inference which is provided by the 
randomization procedure we shall in this paper assume that the only important source of 
experimental error is the unit errors. 


2. THE ANALYSIS OF VARIANCE 
(2.1) The conceptual underlying population 
With the possible application of each of the treatments to each of the experimental units 
we associate a real (unknown) number. This defines a set of rst numbers which we take to be 
the conceptual underlying population for this experiment. Since the fact of the experi- 
mental situation is that each unit can be ‘used’ only once, the population defined is con- 
ceptual in the sense that only a subset of rs of the numbers of interest can be observed. The 
scope of a statistical inference for this situation can be delineated by noting that the 
conceivable totality of experimental information would be given by applying each treatment 
to every experimental unit and observing the response. 


(2.2) The population model 

Let i = 1, 2, ..., r denote the block number. 

Let j = 1, 2, ..., s denote the experimental unit number within each block, where s = pt. 

Let k = 1, 2, ..., t denote the treatment number. 

Let $y_{ijk}$ represent the (conceptual) response which would be obtained if treatment k 
were applied to the jth unit in the ith block. Thus our underlying population is the set of 
(conceptual) unknown numbers $\{y_{ijk}\}$. 

We will employ the usual dot convention for means, for example, $y_{..k} = \sum_{ij} y_{ijk}/(rs)$. 

We now define
\[ \mu = y_{...}, \]
\[ b_i = y_{i..} - y_{...}, \]
\[ t_k = y_{..k} - y_{...}, \]
\[ (bt)_{ik} = (y_{i.k} - y_{i..} - y_{..k} + y_{...}), \]
\[ e_{ij} = y_{ij.} - y_{i..}, \]
\[ \eta_{ijk} = (y_{ijk} - y_{ij.}) - (y_{i.k} - y_{i..}), \]
where i = 1, 2, ..., r; j = 1, 2, ..., s; k = 1, 2, ..., t. 

These quantities may be given a physical interpretation. 

$\mu$ is the (conceptual) overall mean yield which would be obtained if each treatment were 
applied to every unit in every block. 

$b_i$ is the difference between the (conceptual) mean yield of all treatments on all units of 
block i and $\mu$, and may be thought of as the effect attributable to the ith block. 

In an analogous way $t_k$ may be thought of as the effect attributable to the kth treatment. 
We note that by definition this is the average effect over all experimental units. 

$(bt)_{ik}$ is the difference between the effect of treatment k on block i and $t_k$. Thus it is a measure 
of the extent to which treatment k and block i interact and will be called the block-treatment 
interaction. 

$e_{ij}$ gives the difference between the (conceptual) mean of the yields of all treatments on 
unit j of block i and the mean over the whole block. It, therefore, measures the extent to 
which the jth unit deviates from the other units of block i and will be called the unit error. 
The set $\{e_{ij}\}$, j = 1, 2, ..., s, can be used to give a measure of the heterogeneity of units within 
the ith block. 

$\eta_{ijk}$ is the difference between the effect of treatment k on unit j of block i and the effect 
of treatment k over all of block i. It is, therefore, a measure of the extent to which treatment 
k and unit j of block i interact and will be called the unit-treatment interaction. 

The following equation is an algebraic identity:

\[ y_{ijk} = \mu + b_i + t_k + (bt)_{ik} + e_{ij} + \eta_{ijk}. \]
From the definition of the quantities it follows that
\[ \sum_i b_i = \sum_k t_k = \sum_i (bt)_{ik} = \sum_k (bt)_{ik} = \sum_j e_{ij} = \sum_j \eta_{ijk} = \sum_k \eta_{ijk} = 0. \]
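The identity and the zero-sum relations can be confirmed on any array of conceptual responses. The sketch below is an illustration only: the array is filled with arbitrary pseudo-random numbers, and all names (`decompose`, `t_eff`, and so on) are assumptions introduced for the example.

```python
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def decompose(y):
    """Split y[i][j][k] into mu, b_i, t_k, (bt)_ik, e_ij and eta_ijk as defined in (2.2)."""
    r, s, t = len(y), len(y[0]), len(y[0][0])
    I, J, K = range(r), range(s), range(t)
    mu = mean(y[i][j][k] for i in I for j in J for k in K)
    y_i = [mean(y[i][j][k] for j in J for k in K) for i in I]
    y_k = [mean(y[i][j][k] for i in I for j in J) for k in K]
    y_ik = [[mean(y[i][j][k] for j in J) for k in K] for i in I]
    y_ij = [[mean(y[i][j]) for j in J] for i in I]
    b = [y_i[i] - mu for i in I]
    t_eff = [y_k[k] - mu for k in K]
    bt = [[y_ik[i][k] - y_i[i] - y_k[k] + mu for k in K] for i in I]
    e = [[y_ij[i][j] - y_i[i] for j in J] for i in I]
    eta = [[[(y[i][j][k] - y_ij[i][j]) - (y_ik[i][k] - y_i[i]) for k in K]
            for j in J] for i in I]
    return mu, b, t_eff, bt, e, eta

rng = random.Random(0)
r, s, t = 3, 4, 2          # three blocks, s = pt units per block with p = 2
y = [[[rng.gauss(10.0, 2.0) for _ in range(t)] for _ in range(s)] for _ in range(r)]
mu, b, t_eff, bt, e, eta = decompose(y)
```

Adding the six components back together reproduces every $y_{ijk}$ to within rounding error, whatever numbers are placed in the array.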


Algebraically the sets of numbers $\{b_i\}$, $\{t_k\}$, $\{(bt)_{ik}\}$, $\{e_{ij}\}$ and $\{\eta_{ijk}\}$ are pairwise independent 
in the sense that, for example, $t_k = 0$ for all k does not imply $(bt)_{ik} = 0$ for all k. In particular, 
if $(bt)_{ik} = 0$ for all i and k, then algebraically this does not imply that $\eta_{ijk} = 0$ for all i, j and k. 
On the other hand, the physical situation is such that if all the block-treatment interactions 
were zero, then one would expect that all the unit-treatment interactions would be zero. 
This follows from the fact that the experimental units are blocked so as to be more homo- 
geneous within a block than from block to block. If so, then the lack of a differential effect 
from block to block would lead one to expect that the differential effect from unit to unit 
within a block would be negligible. 

The converse is, of course, not true. If the units within a block are sufficiently homo- 
geneous then the unit-treatment interactions will be small in absolute value. But this does 
not preclude important differences between blocks and hence the possible existence of non- 
additive effects from block to block, i.e. block-treatment interactions. It would appear 
therefore that if the blocking of experimental units is successful, then the $\eta_{ijk}$ will be 
negligible while the $(bt)_{ik}$ may be important. 

The essential point here is that it may not be unrealistic to assume that the treatments 
react additively (i.e. unit-treatment interactions are zero) within a block even though they 
react non-additively from block to block. 

In many instances, whether we assume the unit-treatment interactions are zero or not, 
it may be reasonable to assume that the variability of units within a block is essentially the 
same for all blocks. This assumption could be idealized as 


\[ \sum_j e_{ij}^{2} = (s-1)\sigma^{2}, \qquad \sum_j \eta_{ijk}^{2} = (s-1)\sigma_{\eta}^{2} \qquad (i = 1, 2, \ldots, r;\ k = 1, 2, \ldots, t). \]

The importance of this discussion is in indicating the direction of simplifying assumptions 
in the analysis of the design. 


(2.3) The statistical model 

In actually carrying out the experiment, we will in fact observe only a (restricted) random 
sample of size rs from the set of rst numbers $\{y_{ijk}\}$. Let $x_{ikf}$ denote the observation obtained 
from the fth replication of the kth treatment in the ith block, where i = 1, 2, ..., r; 
k = 1, 2, ..., t; and f = 1, 2, ..., p for each (ik). Thus $\sum_f x_{ikf}$ represents the observed total 
response from all units in block i to which treatment k has been applied. 

To write an explicit model for $\sum_f x_{ikf}$ we now define some additional quantities. Let†
\[ \delta_{ij}^{k} = 1 \ \text{if treatment } k \text{ falls on unit } j \text{ of block } i, \qquad = 0 \ \text{otherwise.} \]
Because random methods of allocation are employed, the $\delta_{ij}^{k}$ may be treated as random 
variables, and from the design of the experiment it is easy to specify certain of the 
distributional properties of the $\delta_{ij}^{k}$. For example,
\[ P(\delta_{ij}^{k} = 1) = P(\text{treatment } k \text{ falls on unit } j \text{ of block } i) = p/s. \]
Hence
\[ E(\delta_{ij}^{k}) = p/s, \]
where we follow usual convention and define $E(x)$ to be the mathematical expectation 
of $x$. More detail on the $\delta_{ij}^{k}$ is given by Wilk (1953). 

It is easy to see that

\[ \sum_f x_{ikf} = p[\mu + b_i + t_k + (bt)_{ik}] + \sum_j e_{ij}\,\delta_{ij}^{k} + \sum_j \eta_{ijk}\,\delta_{ij}^{k}. \]

This relation we shall call the statistical model. This formulation exhibits explicitly just
what are the random variables involved in the model, namely, the $D_{ij}^k$, which take on the
values 0 or 1 with known probabilities. We note that it is the physical act of randomization
in allocation of treatments to units that permits us to treat the $D_{ij}^k$ as random variables,
and hence provides some basis for statistical inference.
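The allocation probability quoted above, p/s, can be checked by direct enumeration for a small block. The following is a minimal sketch only; the function names are illustrative and do not come from the paper, and treatments and units are labelled from 0 for convenience.

```python
import random
from fractions import Fraction
from itertools import permutations


def allocation_probability(t, p):
    """Exact P(D = 1) for treatment 0 on unit 0 of one block.

    The block has s = t * p units and each treatment is applied to
    exactly p of them; all distinct allocations are equally likely.
    """
    labels = [k for k in range(t) for _ in range(p)]
    allocations = set(permutations(labels))  # distinct allocations
    hits = sum(1 for a in allocations if a[0] == 0)
    return Fraction(hits, len(allocations))


def random_allocation(t, p, rng):
    """Draw one random allocation of t treatments, p units each."""
    labels = [k for k in range(t) for _ in range(p)]
    rng.shuffle(labels)
    return labels


if __name__ == "__main__":
    print(allocation_probability(3, 2))          # equals p/s = 2/6
    print(random_allocation(3, 2, random.Random(1)))
```

Enumeration is feasible only for very small s; for larger blocks the same probability follows from symmetry.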


(2·4) The analysis of variance table
The primitive analysis of variance for this design is simply a breakdown of the sum of
squares of deviations of individual observations from their mean into additive components
which can be attributed to various sources. The analysis of variance has proved useful in
the statistical analysis of experiments in the estimation of components of variation, in the
estimation of the variance of estimates of treatment comparisons, and in making tests of
significance.

As before, we use the dot convention for means, e.g. $x_{i\cdot\cdot} = \sum_{kf} x_{ikf}/s$. The algebraic detail
of the analysis of variance is given in Table 1.

We can now employ the statistical model for $\sum_f x_{ikf}$ and certain properties of the $D_{ij}^k$
to derive the expectations under randomization of the analysis of variance mean squares.
The detailed algebra is given by Wilk (1953).

The results are tabulated in Table 2.

It is of interest to note that if the $\eta_{ijk}$ are not all zero, then even if $t_k = 0$ for all k, the
expectation of the treatment mean square is not equal to the expectation of the error mean
square. Similarly for the interaction mean square. If the $\eta_{ijk}$ are small compared with the
$e_{ij}$, or if t is large, the bias is negligible. The assumptions

$$\sum_j e_{ij}^2 = (s-1)\sigma^2, \quad \text{all } i,$$
$$\sum_j \eta_{ijk}^2 = (s-1)\sigma_\eta^2, \quad \text{all } i \text{ and } k,$$

do not affect the above discussion.


† A more formal definition of the $D_{ij}^k$, in which they appear as characteristic set functions of
set-valued random variables, is given by Wilk (1953).

Clearly, for small t, a meaningful comparison of the treatment mean square (or interaction
mean square) with the error mean square depends heavily on the assumption that the $\eta_{ijk}$
are negligible.


Table 1. Analysis of variance

Table 2. Expectations of mean squares under randomization

Due to | Mean square | Expectation of mean square
Blocks | $B^*$ | $\dfrac{1}{rt(s-1)}\sum_{ijk}\eta_{ijk}^2 + \dfrac{pt}{r-1}\sum_i b_i^2$
Treatments | $T^*$ | $\dfrac{1}{r(s-1)}\sum_{ij}e_{ij}^2 + \dfrac{(t-2)}{rt(s-1)(t-1)}\sum_{ijk}\eta_{ijk}^2 + \dfrac{rp}{t-1}\sum_k t_k^2$
Interactions | $I^*$ | $\dfrac{1}{r(s-1)}\sum_{ij}e_{ij}^2 + \dfrac{(t-2)}{rt(s-1)(t-1)}\sum_{ijk}\eta_{ijk}^2 + \dfrac{p}{(r-1)(t-1)}\sum_{ik}(bt)_{ik}^2$
Error | $R^*$ | $\dfrac{1}{r(s-1)}\sum_{ij}e_{ij}^2 + \dfrac{1}{rt(s-1)}\sum_{ijk}\eta_{ijk}^2$





In contrast with similar results based on normal theory, the expectation of the mean
square for blocks does not contain all components which appear in the expectation of the
error mean square. The same remark applies to a comparison of blocks mean square with
interaction mean square when all $(bt)_{ik} = 0$. It is apparent that an analysis of variance test
of significance of block effects, for the situation under consideration, cannot be justified
by randomization.


(2·5) Some randomization moments

In Table 3 we give some moments under randomization of the analysis of variance sums
of squares, under the hypothesis that $t_k = (bt)_{ik} = 0$ for all i and k, and using the
simplifying assumptions:

$$\eta_{ijk} = 0, \quad \text{for all } i, j, k,$$
$$\sum_j e_{ij}^2 = (s-1)\sigma^2, \quad \text{for all } i,$$
$$\sum_j e_{ij}^4 = (s-1)(2s-3)\sigma^4/s, \quad \text{for all } i.$$

We use $G_0$, $B_0$, $T_0$, $I_0$ and $R_0$ to denote the values of G, B, T, I and R, respectively, under
the indicated conditions.

The results of Table 3 are derived from more general expressions, given by Wilk (1953),
which use no homogeneity assumptions regarding $\sum_j e_{ij}^2$ and $\sum_j e_{ij}^4$. The algebraic
development is also detailed in the same paper.

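Table 3. Some randomization moments under simplifying assumptions

$$E(G_0) = G_0 = pt\sum_i b_i^2 + r(s-1)\sigma^2, \qquad V(G_0) = 0,$$
$$E(B_0) = B_0 = pt\sum_i b_i^2, \qquad V(B_0) = 0,$$
$$E(T_0) = (t-1)\sigma^2, \qquad V(T_0) = 2(t-1)\sigma^4\left(1 - \frac{1}{rp}\right),$$
$$E(I_0) = (r-1)(t-1)\sigma^2, \qquad V(I_0) = 2(r-1)(t-1)\sigma^4\left(1 - \frac{r-1}{rp}\right),$$
$$E(R_0) = rt(p-1)\sigma^2, \qquad V(R_0) = \frac{2r(t-1)(p-1)\sigma^4}{p},$$
$$\operatorname{cov}(T_0, I_0) = -\frac{2(r-1)(t-1)\sigma^4}{rp}, \qquad \operatorname{cov}(T_0, R_0) = -\frac{2(p-1)(t-1)\sigma^4}{p},$$
$$\operatorname{cov}(I_0, R_0) = -\frac{2(p-1)(t-1)(r-1)\sigma^4}{p}.$$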

It is of some interest to note that for large r or p, $V(T_0)$ approaches $2(t-1)\sigma^4$; for large p,
$V(I_0)$ approaches $2(r-1)(t-1)\sigma^4$; for large p, $V(R_0)$ becomes independent of p and
approaches $2r(t-1)\sigma^4$; if p = 1, $V(I_0)$ becomes $2(r-1)(t-1)\sigma^4/r$, and thence for large r
becomes independent of r. The correspondence to and divergence from the corresponding
moments based on normal theory will be apparent.

The correlation between $T_0$ and $I_0$ is $-\sqrt{\dfrac{(r-1)}{(rp-1)(rp-r+1)}}$, which is $-1$ for p = 1, and
goes to 0 as p increases. The correlation between $T_0$ and $R_0$ is $-\sqrt{\dfrac{(p-1)}{(rp-1)}}$, which for large p
goes to $-\sqrt{1/r}$ and for large r goes to zero. The correlation between $I_0$ and $R_0$ is
$-\sqrt{\dfrac{(r-1)(p-1)}{(rp-r+1)}}$, which for large p goes to $-\sqrt{(r-1)/r}$, and for large r goes to $-1$.
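Since $G_0$ and $B_0$ are fixed under randomization, the sum $T_0 + I_0 + R_0$ is constant, so each variance in Table 3 must be cancelled by the two covariances in the same row. The sketch below checks this for the second moments as read from Table 3 (in units of $\sigma^4$); the function names are illustrative and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction as F


def table3_moments(r, t, p):
    """Second moments of T0, I0, R0 in units of sigma^4."""
    rp = r * p
    var_t = 2 * (t - 1) * (1 - F(1, rp))
    var_i = 2 * (r - 1) * (t - 1) * (1 - F(r - 1, rp))
    var_r = F(2 * r * (t - 1) * (p - 1), p)
    cov_ti = -F(2 * (r - 1) * (t - 1), rp)
    cov_tr = -F(2 * (p - 1) * (t - 1), p)
    cov_ir = -F(2 * (p - 1) * (t - 1) * (r - 1), p)
    return var_t, var_i, var_r, cov_ti, cov_tr, cov_ir


def rows_cancel(r, t, p):
    """True if cov(X, T0 + I0 + R0) = 0 for X = T0, I0 and R0."""
    var_t, var_i, var_r, cov_ti, cov_tr, cov_ir = table3_moments(r, t, p)
    return (var_t + cov_ti + cov_tr == 0
            and var_i + cov_ti + cov_ir == 0
            and var_r + cov_tr + cov_ir == 0)


if __name__ == "__main__":
    for design in [(2, 3, 1), (3, 4, 2), (5, 2, 4)]:
        print(design, rows_cancel(*design))
```

The check is one of internal consistency only; it does not verify the moments against the underlying algebra of Wilk (1953).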

3. ESTIMATION

This section deals with the estimation of certain functions of the parameters of the population
model. Variances of the estimates under various conditions are given, and the estimation
of these variances is considered. Some additional results, as well as the algebraic detail,
are given by Wilk (1953).

The following general notation is used:

$V(\alpha)$ is the variance of $\alpha$ under no assumptions.
$V_1(\alpha)$ is the variance of $\alpha$ under the homogeneity assumptions that

$$\sum_j e_{ij}^2 = (s-1)\sigma^2, \quad \sum_j \eta_{ijk}^2 = (s-1)\sigma_\eta^2 \quad \text{and} \quad \sum_j e_{ij}\eta_{ijk} = 0,$$

for each i = 1,2,...,r and k = 1,2,...,t.

$V^0(\alpha)$ is the variance of $\alpha$ under the assumption that all the $\eta_{ijk}$ are zero.
v denotes the analysis of variance error mean square.


(a) An unbiased estimate of $\mu$ is $\hat\mu = x_{\cdots}$:

$$V(\hat\mu) = \frac{1}{r^2 ts(s-1)}\sum_{ijk}\eta_{ijk}^2,$$

and there appears to be no reasonable way to estimate $\sum \eta_{ijk}^2$.
$V^0(\hat\mu) = 0$, so $\mu$ is known without error if the $\eta_{ijk}$ are all zero.


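(b) An unbiased estimate of $b_i$ is $\hat b_i = x_{i\cdot\cdot} - x_{\cdots}$:

$$V(\hat b_i) = \frac{(r-2)}{rts(s-1)}\sum_{jk}\eta_{ijk}^2 + \frac{1}{r^2 ts(s-1)}\sum_{ijk}\eta_{ijk}^2,$$

$$V_1(\hat b_i) = \frac{(r-1)}{rs}\,\sigma_\eta^2.$$

No reasonable estimate of either $V(\hat b_i)$ or $V_1(\hat b_i)$ is available.

$V^0(\hat b_i) = 0$, so the block effects are known precisely if all the $\eta_{ijk}$ are zero.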


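(c) An unbiased estimate of $t_k$ is $\hat t_k = x_{\cdot k\cdot} - x_{\cdots}$:

$$V(\hat t_k) = \frac{1}{r^2 s(s-1)}\left[(t-1)\sum_{ij}e_{ij}^2 + 2(t-2)\sum_{ij}e_{ij}\eta_{ijk} + (t-3)\sum_{ij}\eta_{ijk}^2 + \frac{1}{t}\sum_{ijk'}\eta_{ijk'}^2\right].$$

$V_1(\hat t_k) = \dfrac{(t-1)}{rs}(\sigma^2 + \sigma_\eta^2) - \dfrac{1}{rs}\sigma_\eta^2$, and $\dfrac{(t-1)}{rs}\,v$ tends to overestimate $V_1(\hat t_k)$.

$V^0(\hat t_k) = \dfrac{(t-1)}{r^2 s(s-1)}\sum_{ij}e_{ij}^2$, and $\dfrac{(t-1)}{rs}\,v$ is an unbiased estimate of $V^0(\hat t_k)$.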
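(d) An unbiased estimate of $(bt)_{ik}$ is $\widehat{(bt)}_{ik} = x_{ik\cdot} - x_{i\cdot\cdot} - x_{\cdot k\cdot} + x_{\cdots}$:

$$V[\widehat{(bt)}_{ik}] = \frac{(r-2)}{rs(s-1)}\left[(t-1)\sum_j e_{ij}^2 + 2(t-2)\sum_j e_{ij}\eta_{ijk} + (t-3)\sum_j \eta_{ijk}^2 + \frac{1}{t}\sum_{jk'}\eta_{ijk'}^2\right]$$
$$\qquad + \frac{1}{r^2 s(s-1)}\left[(t-1)\sum_{ij}e_{ij}^2 + 2(t-2)\sum_{ij}e_{ij}\eta_{ijk} + (t-3)\sum_{ij}\eta_{ijk}^2 + \frac{1}{t}\sum_{ijk'}\eta_{ijk'}^2\right].$$

$V_1[\widehat{(bt)}_{ik}] = \dfrac{(r-1)(t-1)}{rs}(\sigma^2 + \sigma_\eta^2) - \dfrac{(r-1)}{rs}\sigma_\eta^2$, and so $\dfrac{(r-1)(t-1)}{rs}\,v$ tends to overestimate
$V_1[\widehat{(bt)}_{ik}]$.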
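(e) An unbiased estimate of the treatment mean $(\mu + t_k)$ is $(\hat\mu + \hat t_k) = x_{\cdot k\cdot}$:

$$V(\hat\mu + \hat t_k) = \frac{(t-1)}{r^2 s(s-1)}\sum_{ij}\left(e_{ij}^2 + 2e_{ij}\eta_{ijk} + \eta_{ijk}^2\right).$$

$V_1(\hat\mu + \hat t_k) = \dfrac{(t-1)}{rs}(\sigma^2 + \sigma_\eta^2)$, and $\dfrac{(t-1)}{rs}\,v$ is an unbiased estimate of $V_1(\hat\mu + \hat t_k)$.

$V^0(\hat\mu + \hat t_k) = \dfrac{(t-1)}{r^2 s(s-1)}\sum_{ij}e_{ij}^2$, and $\dfrac{(t-1)}{rs}\,v$ is an unbiased estimate of $V^0(\hat\mu + \hat t_k)$.

(f) An unbiased estimate of the treatment contrast $\sum_k \rho_k t_k$, with $\sum_k \rho_k = 0$, is
$\sum_k \rho_k \hat t_k = \sum_k \rho_k (x_{\cdot k\cdot} - x_{\cdots})$:

$$V\Big(\sum_k \rho_k \hat t_k\Big) = \frac{1}{r^2 p(s-1)}\left[\sum_k \rho_k^2 \sum_{ij}e_{ij}^2 + 2\sum_{ijk}\rho_k^2 e_{ij}\eta_{ijk} + \sum_{ijk}\rho_k^2\eta_{ijk}^2 - \frac{1}{t}\sum_{ij}\Big(\sum_k \rho_k\eta_{ijk}\Big)^2\right].$$

$V_1\Big(\sum_k \rho_k \hat t_k\Big) = \dfrac{1}{rp}\Big(\sum_k \rho_k^2\Big)(\sigma^2 + \sigma_\eta^2) - \dfrac{1}{r^2 pt(s-1)}\sum_{ij}\Big(\sum_k \rho_k\eta_{ijk}\Big)^2$, and so $\dfrac{1}{rp}\Big(\sum_k \rho_k^2\Big)v$ tends to
overestimate $V_1\Big(\sum_k \rho_k \hat t_k\Big)$.

$V^0\Big(\sum_k \rho_k \hat t_k\Big) = \dfrac{1}{r^2 p(s-1)}\Big(\sum_k \rho_k^2\Big)\sum_{ij}e_{ij}^2$, and this is estimated unbiasedly by $\dfrac{1}{rp}\Big(\sum_k \rho_k^2\Big)v$.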
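The variance in case (e), $V(\hat\mu + \hat t_k) = \frac{(t-1)}{r^2 s(s-1)}\sum_{ij}(e_{ij} + \eta_{ijk})^2$, can be checked by brute force for a single small block (r = 1). The sketch below assumes the usual decomposition of the conceptual responses $y_{jk}$ into unit and treatment means, so that $e_j + \eta_{jk}$ is the deviation of $y_{jk}$ from the mean response to treatment k; the function name and the data in the usage lines are hypothetical.

```python
from fractions import Fraction as F
from itertools import permutations


def treatment_mean_variance(y, p, k=0):
    """Exact randomization variance of the treatment-k mean in one block,
    together with the closed form (t-1)/(s(s-1)) * sum_j (e_j + eta_jk)^2.

    y[j][c] is the conceptual response of unit j to treatment c;
    the block must contain s = t * p units.
    """
    s, t = len(y), len(y[0])
    assert s == t * p
    unit_mean = [F(sum(row), t) for row in y]
    trt_mean = [F(sum(y[j][c] for j in range(s)), s) for c in range(t)]
    grand = sum(unit_mean) / s
    e = [unit_mean[j] - grand for j in range(s)]
    eta = [[y[j][c] - unit_mean[j] - trt_mean[c] + grand for c in range(t)]
           for j in range(s)]

    # Enumerate every distinct allocation; all are equally likely.
    labels = [c for c in range(t) for _ in range(p)]
    allocations = set(permutations(labels))
    means = [F(sum(y[j][k] for j in range(s) if a[j] == k), p)
             for a in allocations]
    centre = sum(means) / len(means)
    exact = sum((m - centre) ** 2 for m in means) / len(means)

    formula = F(t - 1, s * (s - 1)) * sum((e[j] + eta[j][k]) ** 2
                                          for j in range(s))
    return exact, formula


if __name__ == "__main__":
    responses = [[1, 4], [2, 6], [5, 3], [7, 9]]   # hypothetical y_jk
    print(treatment_mean_variance(responses, p=2, k=0))
```

With several blocks the allocations are independent between blocks, so the block variances add and are divided by $r^2$.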

We close this section with a brief discussion of special cases. If r = 1, p > 1 (completely
randomized design), the discussion of block effects and interactions is to be ignored, and
the remaining results carry over directly.

For the case of p = 1, r > 1 (randomized block design) the situation is somewhat different
in that the analysis of variance error mean square which we have been using becomes
non-existent. If we can assume that all block-treatment interactions are zero, then the
interaction mean square in the randomized block design has expectation

$$\frac{1}{r(t-1)}\sum_{ij}e_{ij}^2 + \frac{(t-2)}{rt(t-1)^2}\sum_{ijk}\eta_{ijk}^2,$$

and for most of the cases discussed above, the variances of estimates may be estimated
unbiasedly.
Kempthorne (1952, 6) has discussed in detail the randomization analysis of the random- 
ized block and completely randomized designs. 


4. TESTS OF SIGNIFICANCE 


In this section we consider tests of significance of a number of null hypotheses (hypotheses
of equivalence) of possible interest. The object of the test is to obtain a measure of the
adequacy of the experiment to indicate conclusions.

To employ a randomization test of a null hypothesis we require no assumptions about the 
form of the frequency distribution of the observations. A real-valued function of the 
observations is selected which will reflect (in a monotone increasing fashion, say) the devia- 
tion from equivalence of the treatments. 

Having obtained a set of numbers for the particular experimental arrangement, we 
associate these numbers with the experimental units from which they derived. We then 
evaluate the function selected for the actual disposition of the treatments and for all other 
dispositions possible under the restriction of the design. A measure of the strength of the
evidence against the null hypothesis (i.e. the level of significance† of the experiment) is
the proportion of the values of the function which exceed the ‘observed’ value. The initial
introduction of randomization enables a probabilistic interpretation.
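The procedure just described can be sketched in code. Because full enumeration is usually prohibitive, the sketch below samples re-randomizations at random instead of listing them all, which is a Monte Carlo substitute and not the exact test. The criterion used is the ratio of the between-treatment (within-block) sum of squares to the total within-block sum of squares; the function names, the number of draws and the data are illustrative assumptions. The observed arrangement is counted as at least as extreme as itself, so the proportion reported uses "greater than or equal to".

```python
import random


def criterion(values, alloc, t):
    """Between-treatment SS within blocks divided by total within-block SS.

    values[i][j] is the observation on unit j of block i and
    alloc[i][j] is the treatment label (0..t-1) on that unit.
    """
    between, total = 0.0, 0.0
    for y, a in zip(values, alloc):
        block_mean = sum(y) / len(y)
        total += sum((v - block_mean) ** 2 for v in y)
        for k in range(t):
            group = [v for v, lab in zip(y, a) if lab == k]
            between += len(group) * (sum(group) / len(group) - block_mean) ** 2
    return between / total


def randomization_p_value(values, alloc, t, n_draws=2000, seed=0):
    """Monte Carlo estimate of the randomization significance level."""
    rng = random.Random(seed)
    observed = criterion(values, alloc, t)
    count = 0
    for _ in range(n_draws):
        shuffled = []
        for a in alloc:            # re-randomize independently in each block
            b = list(a)
            rng.shuffle(b)
            shuffled.append(b)
        if criterion(values, shuffled, t) >= observed - 1e-12:
            count += 1
    return observed, count / n_draws


if __name__ == "__main__":
    data = [[1, 2, 3, 11, 12, 13], [2, 3, 4, 12, 13, 14]]   # hypothetical
    layout = [[0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 1]]
    print(randomization_p_value(data, layout, t=2))
```

The denominator of this criterion is the same for every re-randomization, so only the numerator needs recomputing in a more economical implementation.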

In general, the amount of computation implied by the above procedure is prohibitive.
Consequently, several writers (Welch, 1937; Pitman, 1937; Kempthorne, 1952a, b) have
studied the possible approximation to the randomization test of certain procedures which
derive from normal theory assumptions. Since the normal theory criteria fulfil the
requirement for a randomization test function, the approach has been to examine the
correspondence by comparing randomization means and variances of these criteria with their normal
theory analogues. We proceed to make a similar examination for several null hypotheses
of possible interest in the generalized randomized block design. The normal theory analysis
for this design has been outlined by Wilk (1953).

† A high level of significance corresponds to a low proportion.

In all that follows in this section we make the premise that the $\eta_{ijk}$ are all zero, and
that $\sum_j e_{ij}^2 = (s-1)\sigma^2$, $\sum_j e_{ij}^4 = (s-1)(2s-3)\sigma^4/s$ (i = 1,2,...,r).

(a) Consider the null hypothesis that the treatments all react identically on every
experimental unit. This implies that $y_{ijk} = y_{ij}$ for all i, j, k, and hence $t_k = (bt)_{ik} = \eta_{ijk} = 0$
for all i, j, k.

The normal theory test of this hypothesis may be based on the criterion

$$(T_0 + I_0)/(T_0 + I_0 + R_0),$$

which under the usual normal theory assumptions and under the present null hypothesis
follows a Beta distribution

$$f(q) = Kq^{\frac{1}{2}f_1 - 1}(1-q)^{\frac{1}{2}f_2 - 1} \qquad (0 < q < 1),$$

where $f_1 = r(t-1)$, $f_2 = rt(p-1)$.

Let U and U* denote the criterion above under normal theory and randomization
respectively. Then 0 < U < 1 and 0 < U* < 1. Since the denominator of U* is constant under
randomization,

$$E(U^*) = \frac{(t-1)}{(s-1)} \quad \text{and} \quad V(U^*) = \frac{2(t-1)(p-1)}{rp(s-1)^2}.$$

Under normal theory,

$$E(U) = \frac{(t-1)}{(s-1)}, \qquad V(U) = \frac{2t(t-1)(p-1)}{(s-1)^2(rs-r+2)}.$$

Thus E(U*) = E(U); and for large s or small r, V(U) approaches V(U*).

If we can judge the correspondence of the distributions of U and U* from the comparison
of their ranges, means and variances, it would appear that the randomization test may be
approximated by the usual test based on normal theory.

To better the approximation to the randomization distribution of U* by a Beta
distribution, we might use a Beta variate Q, with density

$$f(Q) = KQ^{\frac{1}{2}f_1' - 1}(1-Q)^{\frac{1}{2}f_2' - 1} \qquad (0 < Q < 1),$$

such that

$$E(Q) = \frac{f_1'}{f_1' + f_2'} = \frac{(t-1)}{(s-1)} = E(U^*),$$

and

$$V(Q) = \frac{2f_1' f_2'}{(f_1' + f_2')^2(f_1' + f_2' + 2)} = \frac{2(t-1)(p-1)}{rp(s-1)^2} = V(U^*).$$

Then

$$f_1' = \frac{(t-1)(rs-2)}{(s-1)}, \qquad f_2' = \frac{t(p-1)(rs-2)}{(s-1)}.$$

For different assumptions regarding the quantities $\sum_j e_{ij}^2$ and $\sum_j e_{ij}^4$ (i = 1,2,...,r), we
would thus arrive at different Beta distributions to approximate the randomization test.
Welch (1937) considered such an adjustment for the case of the randomized block design
when the blocks do not exhibit equal variation.

(b) We consider now a situation in which we can assume that $(bt)_{ik} = 0$ for all i and k,
and wish to test the null hypothesis that $t_k = 0$, k = 1, 2,...,t.

The criterion suggested by normal theory for this situation is $T_0/(T_0 + I_0 + R_0)$. Let W
denote this quantity under normal theory, and W* denote it under randomization.

Then under the null hypothesis, 0 < W < 1, 0 < W* < 1, and W is a Beta variate with

$$E(W) = \frac{(t-1)}{r(s-1)}, \qquad V(W) = \frac{2(t-1)(rs-r-t+1)}{r^2(s-1)^2(rs-r+2)},$$

while

$$E(W^*) = \frac{(t-1)}{r(s-1)} \quad \text{and} \quad V(W^*) = \frac{2(t-1)}{r^2(s-1)^2}\left(1 - \frac{1}{rp}\right).$$

Thus E(W) = E(W*), and for large values of p or r, V(W) and V(W*) are approximately
equal.

Of course as in (a) we can define a Beta variate which will have the same mean and
variance as W* and, perhaps, in this way better the approximation.
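The matching of a Beta variate to a given mean and variance, used for U* in (a) and suggested for W* in (b), can be written out generally. For a density proportional to $q^{\frac12 f_1 - 1}(1-q)^{\frac12 f_2 - 1}$ the mean is $m = f_1/(f_1+f_2)$ and the variance is $2m(1-m)/(f_1+f_2+2)$, so that $f_1 + f_2 = 2m(1-m)/v - 2$. The sketch below applies this with the randomization moments quoted in this section; the function names are illustrative and exact rational arithmetic is used.

```python
from fractions import Fraction as F


def match_beta(mean, var):
    """Degrees of freedom (f1, f2) of the Beta variate with the given
    mean and variance, for density ~ q^(f1/2 - 1) (1 - q)^(f2/2 - 1)."""
    n = 2 * mean * (1 - mean) / var - 2      # n = f1 + f2
    return mean * n, (1 - mean) * n


def moments_u_star(r, t, p):
    """Randomization mean and variance of U* = (T0 + I0)/(T0 + I0 + R0)."""
    s = t * p
    return F(t - 1, s - 1), F(2 * (t - 1) * (p - 1), r * p * (s - 1) ** 2)


def moments_w_star(r, t, p):
    """Randomization mean and variance of W* = T0/(T0 + I0 + R0)."""
    s = t * p
    mean = F(t - 1, r * (s - 1))
    var = F(2 * (t - 1), r * r * (s - 1) ** 2) * (1 - F(1, r * p))
    return mean, var


if __name__ == "__main__":
    r, t, p = 3, 4, 2
    print(match_beta(*moments_u_star(r, t, p)))   # compare with f1', f2' in (a)
    print(match_beta(*moments_w_star(r, t, p)))
```

For U* the result agrees with the closed forms $f_1' = (t-1)(rs-2)/(s-1)$ and $f_2' = t(p-1)(rs-2)/(s-1)$; the matched values are generally not integers.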

A special case of interest is that in which p = 1, r > 1, i.e. the randomized block design.
In that case, R does not exist, and carrying out a meaningful analysis of variance test of
significance for that design hinges upon the assumption that the block-treatment
interactions are zero.

Also, of particular concern here is the case of r = 1, p > 1, i.e. the completely randomized
design. In that case, I does not exist, and the test we have discussed is the one of interest
for that design.


It is a pleasure to acknowledge the assistance of Prof. O. Kempthorne in the preparation
of this paper.

SOME QUICK SIGN TESTS FOR TREND IN LOCATION

By D. R. COX
Statistical Laboratory, University of Cambridge

AND A. STUART

Division of Research Techniques, London School of Economics


1. INTRODUCTION AND SUMMARY 


Many distribution-free tests have been devised to test the hypothesis of randomness of 
a series of N observations, i.e. the hypothesis that N independent random variables have 
the same continuous distribution function. Of these, the rank correlation tests are the most 
efficient tests against normal trend alternatives, but others are of some use in situations 
where speed and simplicity of computation are important. 

In this paper, we discuss a class of simple sign tests, considered first as tests against trend 
in location. Optimum tests are found from the standpoint of asymptotic relative efficiency 
(a measure of local power in large samples), and it appears that the best of these tests may 
be preferred to the other simple tests considered in the literature, although they are, of 
course, less efficient than the rank correlation tests. 

Similar tests are available for trend in dispersion, and the efficiency of these, in the normal 
situation, is investigated and compared with the test based on the maximum-likelihood 
estimator. Finally, we add a few remarks on sequential sign tests. 

Readers not interested in the theory should look at §§ 10, 11 and 14, where there are brief 
statements of the tests and numerical examples. 


2. THE SIGN TESTS FOR TREND IN LOCATION 


We consider a series of N independent observations from a standardized normal regression
model with an upward trend, i.e.

$$H_1:\quad y_t = \alpha + \Delta t + \epsilon_t \qquad (t = 1, 2, \ldots, N),$$

where $\Delta > 0$ and the $\epsilon_t$ are independent standardized normal variates. We wish to test the
null hypothesis

$$H_0:\quad \Delta = 0,$$

using a distribution-free test statistic, so that our test will remain valid whatever the
continuous distribution of the $\epsilon$ terms in the model, although naturally its efficiency will vary
with the form of the distribution.

The most efficient known distribution-free tests of $H_0$ are those based on the rank
correlation coefficients (Stuart, 1954), but our object here is specifically to find tests which are
quick and simple to compute. We define for i < j the score

$$h_{ij} = \begin{cases} 1 & \text{if } y_i > y_j, \\ 0 & \text{if } y_i < y_j. \end{cases}$$

$h_{ij}$ is thus based on a comparison of the ith and jth in the series of observations. The
distribution of the observations will be assumed continuous so that the possibility of ties can
be ignored (see, however, § 16).

We confine ourselves throughout to comparisons of independent pairs of observations,
i.e. no observation is compared with more than one other observation. (This is in contrast
to the procedure used in calculating the rank correlation coefficients, where every
observation is compared with every other observation in the series.) Since there are N observations,
there can be no more than ½N such independent comparisons. We now assume N to be even
and always take i < j. Our problem is to find the set of comparisons and the appropriate
weights $w_{ij}$ which will make the statistic

$$S = \sum w_{ij}h_{ij} \qquad (1)$$

as efficient a test of $H_0$ as possible. The summation in (1) contains ½N terms, no suffix being
repeated. All tests of the form (1) are distribution-free, since on the null hypothesis any
$h_{ij}$ is a 0-1 variate with probabilities (½, ½), whatever the distribution of the $\epsilon_t$.


3. ASYMPTOTIC RELATIVE EFFICIENCY 


We shall use as our criterion of efficiency the asymptotic relative efficiency (A.R.E.) of a test.
If there are two consistent tests, s and t, of a hypothesis $H_0$: $\Delta = 0$, the A.R.E. is the reciprocal
of the ratio of sample sizes required to attain the same power against the same alternative
hypothesis $H_1$, taking the limit as the sample size N tends to infinity and as $H_1$ tends to $H_0$.
(This second limiting process is necessary to keep the power of consistent tests bounded
away from 1.) Pitman (1948) and Noether (1955) have shown that, if s and t both have
normal limiting distributions on $H_0$ and $H_1$, the A.R.E. of s compared to t is given by

$$\text{A.R.E.}(s, t) = \lim_{N\to\infty}\left(\frac{R^2(s)}{R^2(t)}\right)^{1/r}, \qquad (2)$$

where

$$R^2(X) = \left\{\left[\frac{\partial E(X \mid \Delta)}{\partial\Delta}\right]_{\Delta=0}\right\}^2 \Big/ D^2(X \mid \Delta=0), \qquad (3)$$

provided that r satisfies the equations

$$\lim_{N\to\infty} R^2(s)N^{-r} = R_s, \qquad \lim_{N\to\infty} R^2(t)N^{-r} = R_t. \qquad (4)$$

Here E and $D^2$ denote mean and variance as usual, while $R_s$ and $R_t$ are constants
independent of N. The interpretation of the A.R.E. is discussed critically in § 9 below.


4. THE BEST SIGN TEST 
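Since $h_{ij}$ is a 0-1 variate,

$$E(h_{ij}) = \operatorname{prob}(y_i > y_j),$$

and as $(y_j - y_i)$ is a normal variate with mean $(j-i)\Delta$ and variance 2 this is

$$E(h_{ij}) = G\left\{\frac{(i-j)\Delta}{\sqrt{2}}\right\}, \qquad (5)$$

where

$$G(x) = \int_{-\infty}^{x}(2\pi)^{-\frac{1}{2}}e^{-\frac{1}{2}t^2}\,dt.$$

Now

$$\left[\frac{\partial G(x)}{\partial x}\right]_{x=0} = (2\pi)^{-\frac{1}{2}}, \qquad (6)$$

so that, by (5) and (6),

$$E'(h_{ij}) = \left[\frac{\partial E(h_{ij})}{\partial\Delta}\right]_{\Delta=0} = \frac{(i-j)}{2\sqrt{\pi}}. \qquad (7)$$

We now write $(j - i) = r_{ij}$. Using (7) in (1), we obtain

$$E'(S) = \sum w_{ij}E'(h_{ij}) = -\frac{1}{2\sqrt{\pi}}\sum w_{ij}r_{ij}. \qquad (8)$$

We also have

$$D^2(S \mid \Delta=0) = \sum w_{ij}^2 V(h_{ij} \mid H_0) = \tfrac{1}{4}\sum w_{ij}^2. \qquad (9)$$

Equations (3), (8) and (9) give

$$R^2(S) = \frac{1}{\pi}\,\frac{(\sum w_{ij}r_{ij})^2}{\sum w_{ij}^2}, \qquad (10)$$

and we now wish to maximize (10) to obtain the highest possible A.R.E. We do this in two
stages. First we maximize (10) with respect to the $w_{ij}$, regarding the $r_{ij}$ as fixed, and then
we choose the supremum of these maxima for variations in the $r_{ij}$.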

To maximize (10) for fixed $r_{ij}$ and variation in the $w_{ij}$, we must maximize $\sum w_{ij}r_{ij}$ subject
to $\sum w_{ij}^2$ being held constant, i.e. we must unconditionally maximize

$$F = \sum w_{ij}r_{ij} - \lambda\sum w_{ij}^2.$$

It is clear from the conditions of the problem that each $w_{ij}$ will be a function of the
corresponding $r_{ij}$, so that on differentiating F for a stationary value we get

$$\frac{\partial F}{\partial w_{ij}} = r_{ij} + w_{ij}\frac{\partial r_{ij}}{\partial w_{ij}} - 2\lambda w_{ij} = 0,$$

i.e.

$$\frac{r_{ij}}{w_{ij}} + \frac{\partial r_{ij}}{\partial w_{ij}} = 2\lambda.$$

This is satisfied by

$$w_{ij} = r_{ij}, \qquad (11)$$

so that the required set of weights are proportional to the distances apart of the observations
compared. The stationary value is a maximum. Substituting (11) into (10), we have

$$R^2(S) = \frac{1}{\pi}\sum r_{ij}^2. \qquad (12)$$

This is the maximum value of $R^2(S)$ for a fixed set of $r_{ij}$. The $r_{ij}$ are a set of ½N differences
between pairs of integers chosen from the integers 1, 2,...,N. It is easily seen that $\sum r_{ij}^2$ is
largest when the pairs are (1, N), (2, N−1), (3, N−2) and so on. In general

$$r_k = (N-k+1) - k = N - 2k + 1 \qquad (k = 1, 2, \ldots, \tfrac{1}{2}N), \qquad (13)$$

so that

$$\sum r_{ij}^2 = \sum_{k=1}^{\frac{1}{2}N}(N-2k+1)^2 = \tfrac{1}{6}N(N^2-1), \qquad (14)$$

and the supremum value of (12) is therefore

$$R^2(S_1) = \frac{N(N^2-1)}{6\pi} \sim \frac{N^3}{6\pi}. \qquad (15)$$

We have denoted by $S_1$ the optimum S statistic

$$S_1 = \sum_{k=1}^{\frac{1}{2}N}(N-2k+1)\,h_{k,\,N-k+1},$$

for which

$$E(S_1 \mid \Delta=0) = \tfrac{1}{2}\sum(N-2k+1) = \tfrac{1}{8}N^2, \qquad D^2(S_1 \mid \Delta=0) = \tfrac{1}{4}\sum(N-2k+1)^2 = \tfrac{1}{24}N(N^2-1). \qquad (16)$$

The test based on $S_1$ is essentially a simplified version of Spearman’s rank correlation
test, which is in effect defined by


$$V = \sum_{i<j}(j-i)\,h_{ij}, \qquad (17)$$

where the summation extends over all possible ½N(N − 1) pairs of observations. Stuart
(1954) has shown that

$$R^2(V) \sim \frac{N^3}{4\pi}, \qquad (18)$$

so that using (15) and (18) in (2) and (4) with r = 3, we obtain for the A.R.E. of $S_1$ compared
to V

$$\text{A.R.E.}(S_1, V) = (\tfrac{2}{3})^{\frac{1}{3}} = 0.87. \qquad (19)$$

The loss of A.R.E. involved in reducing the number of comparisons from ½N(N − 1) to ½N
is as little as 13%.

These values of the A.R.E. depend on the assumption of normality, but the calculation
of the form of the optimum statistic, $S_1$, and also of the statistic, $S_2$, of § 5, does not. For (7)
remains true, with a changed numerical factor, for general continuous distributions.
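The optimum statistic $S_1$ pairs the kth observation with the (N − k + 1)th and weights the comparison by N − 2k + 1. A minimal sketch of its computation follows, together with the null mean ⅛N² and null variance N(N² − 1)/24; the function names are illustrative, N is assumed even, and ties are assumed absent as in the text.

```python
from fractions import Fraction


def s1_statistic(y):
    """Weighted sign statistic S1 for a series y of even length N.

    h = 1 when the earlier observation exceeds the later one, so an
    upward trend tends to make S1 small.
    """
    n = len(y)
    if n % 2 != 0:
        raise ValueError("N must be even")
    total = 0
    for k in range(1, n // 2 + 1):
        weight = n - 2 * k + 1
        if y[k - 1] > y[n - k]:      # compare y_k with y_{N-k+1}
            total += weight
    return total


def s1_null_moments(n):
    """Mean and variance of S1 on the null hypothesis."""
    return Fraction(n * n, 8), Fraction(n * (n * n - 1), 24)


if __name__ == "__main__":
    series = [2.1, 1.7, 3.0, 2.8, 3.9, 4.4]      # hypothetical data
    print(s1_statistic(series), s1_null_moments(len(series)))
```

In large samples the statistic may be referred to a normal distribution with these moments; for small N the exact null distribution can be enumerated, since each comparison is an independent 0-1 variate with probability ½.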


5. AN UNWEIGHTED SIGN TEST 
The relatively high efficiency of $S_1$ compared to V leads us to construct, by analogy, a
simplified version of Kendall’s rank correlation test, which may be defined by

$$Q = \sum_{i<j}h_{ij}, \qquad (20)$$

and gives equal weight to all ½N(N − 1) comparisons. Q has the same A.R.E. as V (Stuart,
1954). The analogous sign test, based on ½N equally weighted independent comparisons, is

$$S_2 = \sum h_{ij}, \qquad (21)$$

and using (10), we obtain, with all $w_{ij} = 1$,

$$R^2(S_2) = \frac{2}{\pi N}\Big(\sum r_{ij}\Big)^2. \qquad (22)$$

We now require to choose ½N pairs from the first N integers so that (22) or, equivalently,
$\sum(j-i) = \sum r_{ij}$, takes its maximum value. This occurs whenever every i is chosen from the
first ½N integers and every j from the last ½N integers. In particular, it occurs when every
$r_{ij} = \tfrac{1}{2}N$ exactly, so that

$$S_2 = \sum_{k=1}^{\frac{1}{2}N}h_{k,\,\frac{1}{2}N+k}, \qquad (23)$$

and (22) becomes

$$R^2(S_2) = \frac{N^3}{8\pi}. \qquad (24)$$

Using (24) and (18), we obtain

$$\text{A.R.E.}(S_2, V) = (\tfrac{1}{2})^{\frac{1}{3}} = 0.79, \qquad (25)$$

while from (15)

$$\text{A.R.E.}(S_2, S_1) = (\tfrac{3}{4})^{\frac{1}{3}} = 0.91. \qquad (26)$$

Thus the simplified version of Kendall’s rank correlation test is 21% less efficient than
Q or V, and 9% less efficient than the simplified Spearman coefficient $S_1$. The use of $S_2$ is
equivalent to a test considered by Theil (1950).


6. THE BEST UNWEIGHTED SIGN TEST 


However, we can improve on the efficiency of $S_2$, and in fact get very nearly as high an
efficiency as that of $S_1$, by ‘throwing away’ some of the ½N comparisons and retaining equal
weights for the others. This was suggested by one of the present authors in the discussion
of Foster & Stuart (1954); it leads to an increase in efficiency because, by comparing
observations further apart, individual comparisons are made more sensitive.

In (1), let every $w_{ij}$ be either 0 or 1, and let m ($\le \tfrac{1}{2}N$) be the number of non-zero $w_{ij}$.
For our new statistic $S_3$ we have, as in (8),

$$E'(S_3) = -\frac{1}{2\sqrt{\pi}}\sum w_{ij}r_{ij} \qquad (w_{ij} = 0 \text{ or } 1), \qquad (27)$$

and from (9)

$$D^2(S_3 \mid \Delta=0) = \tfrac{1}{4}m, \qquad (28)$$

so that (3), (27) and (28) give

$$R^2(S_3) = \frac{1}{\pi m}\Big(\sum w_{ij}r_{ij}\Big)^2 \qquad (w_{ij} = 0 \text{ or } 1). \qquad (29)$$

To maximize this by choice of m and $r_{ij}$, we again work in two stages. For fixed m, (29) will
take its largest value when the comparisons given zero weights are based on the middle
(N − 2m) observations, while every i is chosen from the first m observations and every j is
chosen from the last m observations. In particular, this will be so when every $r_{ij} = (N - m)$
exactly, so that

$$S_3 = \sum_{k=1}^{m}h_{k,\,N-m+k}, \qquad (30)$$

and (29) becomes

$$R^2(S_3) = \frac{m(N-m)^2}{\pi}. \qquad (31)$$

(31) is the largest possible value of $R^2(S_3)$ for fixed m. ($S_2$ is the special case of $S_3$ when
m = ½N.) We now choose m to maximize (31). Differentiating, we get

$$m = \tfrac{1}{3}N \qquad (32)$$

for a maximum, so that finally

$$S_3 = \sum_{k=1}^{\frac{1}{3}N}h_{k,\,\frac{2}{3}N+k},$$

for which

$$E(S_3) = \tfrac{1}{6}N, \qquad V(S_3) = \tfrac{1}{12}N, \qquad (33)$$

and from (31),

$$R^2(S_3) = \frac{4N^3}{27\pi}. \qquad (34)$$

From (34) and (18), we have

$$\text{A.R.E.}(S_3, V) = (\tfrac{16}{27})^{\frac{1}{3}} = 0.84, \qquad (35)$$

while from (15)

$$\text{A.R.E.}(S_3, S_1) = (\tfrac{8}{9})^{\frac{1}{3}} = 0.96. \qquad (36)$$

Compared with either V or $S_1$, $S_3$ has about 5% higher efficiency than $S_2$, and in fact its
efficiency is 96% of that of $S_1$, so that for practical purposes it may be recommended instead
of $S_1$ because it requires no weighting of the comparisons.

7. COMPARISON OF THE SIGN TESTS

In Table 1, the A.R.E. of the sign tests are tabulated, compared to each other, to the rank
correlation tests, and to the best (parametric) test against normal regression, based on the
sample regression coefficient b, which has a value of (3) given by

$$R^2(b) \sim \frac{N^3}{12}, \qquad (37)$$

as follows immediately from the fact that b is an unbiased estimator of $\Delta$ with variance
$12/\{N(N^2-1)\}$.


Table 1. Asymptotic relative efficiencies of sign tests

Test statistic | Compared to $S_1$ | Compared to rank correlation tests | Compared to best parametric test
$S_1 = \sum_{k=1}^{\frac{1}{2}N}(N-2k+1)\,h_{k,\,N-k+1}$ | 1.00 | 0.87 | 0.86
$S_2 = \sum_{k=1}^{\frac{1}{2}N}h_{k,\,\frac{1}{2}N+k}$ | 0.91 | 0.79 | 0.78
$S_3 = \sum_{k=1}^{\frac{1}{3}N}h_{k,\,\frac{2}{3}N+k}$ | 0.96 | 0.84 | 0.83







From (2), (18) and (37), it follows that the A.R.E. of either rank correlation coefficient
compared to b is

$$\text{A.R.E.}(V, b) = \left(\frac{3}{\pi}\right)^{\frac{1}{3}} = 0.98, \qquad (38)$$

and not 3/π = 0.95 as given by Stuart (1954).
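The efficiencies quoted in §§ 4 to 7 all follow from (2) with r = 3 and the limiting constants $\lim R^2 N^{-3}$ of the statistics concerned. The sketch below recomputes them; the constants are those implied by the relations for $S_1$, $S_2$, $S_3$, V and b in the text, and the dictionary and function names are illustrative.

```python
from math import pi

# Limiting values of R^2(X) / N^3 for each statistic.
R_LIMIT = {
    "S1": 1 / (6 * pi),
    "S2": 1 / (8 * pi),
    "S3": 4 / (27 * pi),
    "V": 1 / (4 * pi),
    "b": 1 / 12,
}


def are(first, second, r=3):
    """Asymptotic relative efficiency of `first` compared to `second`."""
    return (R_LIMIT[first] / R_LIMIT[second]) ** (1 / r)


if __name__ == "__main__":
    for stat in ("S1", "S2", "S3"):
        row = [round(are(stat, ref), 2) for ref in ("S1", "V", "b")]
        print(stat, row)
    print("V vs b", round(are("V", "b"), 2))
```

Because r is the same for every statistic listed, each efficiency reduces to the cube root of a ratio of constants and does not depend on N.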


8. COMPARISON WITH A.R.E. OF OTHER TESTS 


Apart from the two rank correlation tests already discussed, Stuart (1954) investigated
the A.R.E. of three other distribution-free tests for trend in location. Two of these, the rank
serial correlation test and the turning point test, were found to have zero values of R as
defined by (3); the third, the difference-sign test, was found to have a value of r equal to 1
in (4), as against r = 3 for all the tests considered in this paper. It followed that the three
tests mentioned all have A.R.E. zero compared to the rank correlation tests (and hence to
all the tests discussed here). Noether (1955) gives general results which rigorize these
conclusions.

A well-known and simple test which has not, as far as we know, previously been discussed
from the point of view of A.R.E. is the median test, due to Brown & Mood (1951). The N
(even) observations are divided into two sets of ½N consecutive observations. The test
statistic is simply the number of observations in the first set which exceed the sample
median $y_m$, and it is therefore defined by

$$B = \sum_{i=1}^{\frac{1}{2}N}b_{im}, \qquad (39)$$

where

$$b_{im} = \begin{cases} 1 & \text{if } y_i > y_m, \\ 0 & \text{if } y_i < y_m. \end{cases}$$

The A.R.E. of B is easily obtained. We know that $y_i$ is a normal variate with mean $(\alpha + i\Delta)$
and unit variance. It follows that the sample median $y_m$ is asymptotically a normal variate
with mean $(\alpha + \tfrac{1}{2}(N+1)\Delta)$ and variance of order $N^{-1}$. Since $y_i$ and $y_m$ are asymptotically
independent, $(y_i - y_m)$ is asymptotically normal with mean $\Delta[i - \tfrac{1}{2}(N+1)]$ and unit variance,
so that for $i < \tfrac{1}{2}(N+1)$

$$E(b_{im}) = \operatorname{prob}(y_i > y_m) \sim 1 - G\{\Delta[\tfrac{1}{2}(N+1) - i]\}. \qquad (40)$$

Using (6) in (40), we obtain

$$E'(b_{im}) \sim -\frac{1}{\sqrt{(2\pi)}}\{\tfrac{1}{2}(N+1) - i\}, \qquad (41)$$

so that from (39) and (41),

$$E'(B) = \sum E'(b_{im}) \sim -\frac{1}{\sqrt{(2\pi)}}\sum_{i=1}^{\frac{1}{2}N}\{\tfrac{1}{2}(N+1) - i\} \sim -\frac{N^2}{8\sqrt{(2\pi)}}. \qquad (42)$$

Also (Brown & Mood, 1951)

$$D^2(B \mid \Delta=0) \sim \frac{N}{16}. \qquad (43)$$

(42) and (43) give, in (3),

$$R^2(B) \sim \frac{N^3}{8\pi}. \qquad (44)$$

Comparison of (44) with (24) shows that B has precisely the same A.R.E. as $S_2$, and is therefore
slightly less efficient than $S_1$ and $S_3$. If the observations are available in serial order, $S_3$ is
simpler to compute than B, which involves ranking all the observations to find the median,
and then making ½N comparisons, as against ⅓N for $S_3$. There is therefore no reason to prefer
B to $S_3$ in this case. If, however, the data were available graphically, B would be
considerably easier to compute, and this would outweigh the slight loss of efficiency compared to $S_3$.
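For comparison with the sign statistics, the median test statistic B of (39) can be computed as below. This is a sketch only: the function name is illustrative, N is assumed even, and the sample median is taken as the mean of the two middle order statistics, which is what `statistics.median` returns for an even number of observations.

```python
from statistics import median


def median_test_statistic(y):
    """Number of observations in the first half of the series that
    exceed the sample median of the whole series."""
    n = len(y)
    if n % 2 != 0:
        raise ValueError("N must be even")
    y_m = median(y)
    return sum(1 for value in y[: n // 2] if value > y_m)


if __name__ == "__main__":
    series = [2.1, 1.7, 3.0, 2.8, 3.9, 4.4]      # hypothetical data
    print(median_test_statistic(series))
```

With an upward trend the first half of the series tends to lie below the median, so small values of B count against the null hypothesis.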


9. COMPARISON OF THE POWERS OF TESTS 


So far we have compared tests by the A.R.E. in the usual way. Before considering the power
of the test $S_3$ in small samples it is convenient to examine the meaning of the A.R.E. more
carefully. If the A.R.E. of a quick test relative to an efficient test is A, then asymptotically
$A^{-1}$ times as many observations have to be made for the quick test to give the same local power
as the efficient test. This is directly relevant if in designing an experiment a choice has to
be made between, on the one hand, using an efficient method of analysis and on the other
taking more observations and using a quick method of analysis. But it is not directly
relevant to the choice of a method of analysis for a given body of data, because it depends in
part on r, defined by (4), measuring the rate at which power increases with increasing N.
For a given problem r is fixed and so the A.R.E. can be reinterpreted in terms of the power
attained at a fixed sample size, but it seems preferable to compare tests directly in terms
of power.

Consider a test based on a statistic S normally distributed with mean $E(S \mid \Delta)$ and
standard deviation $D(S \mid \Delta)$, where the null hypothesis is $\Delta = 0$. The null hypothesis is
rejected at the significance level $\alpha$ if

$$S > E(S \mid 0) + \lambda_\alpha D(S \mid 0), \qquad (45)$$

where

$$G(-\lambda_\alpha) = \alpha. \qquad (46)$$

The power of the test is $G[p(\Delta)]$, where

$$p(\Delta) = \frac{E(S \mid \Delta) - E(S \mid 0) - \lambda_\alpha D(S \mid 0)}{D(S \mid \Delta)}. \qquad (47)$$

Now

$$p'(0) = \left[\frac{\partial p(\Delta)}{\partial\Delta}\right]_{\Delta=0} = \frac{E'(S \mid 0) + \lambda_\alpha D'(S \mid 0)}{D(S \mid 0)}. \qquad (48)$$

In all the applications in this paper $D'(S \mid 0) = 0$, so that

$$p'(0) = \frac{E'(S \mid 0)}{D(S \mid 0)} = R(S). \qquad (49)$$

Near $\Delta = 0$,

$$p(\Delta) = \Delta R(S) - \lambda_\alpha + O(\Delta^2), \qquad (50)$$

and in applications the first two terms give, asymptotically in N, the whole of the power
curve. Moreover, $R(S) \sim RN^{\frac{1}{2}r}$ as $N \to \infty$ and comparable tests of a given hypothesis will
have the same r; hence we usually need to consider just R. We call $p(\Delta)$ the power deviate
and $p'(0)$ the power derivative. Asymptotically the graph of $p(\Delta)$ against $\Delta$ is linear, and tests
at different significance levels are given by parallel straight lines. Or to put the same fact
another way, the power curves are asymptotically linear when plotted on arithmetical
probability paper.
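The linear approximation (50) to the power deviate gives an asymptotic power of $G[\Delta R(S) - \lambda_\alpha]$. A minimal sketch follows, using the standard normal distribution from the Python standard library; the function names and the numerical values in the usage lines are illustrative, and R(S) is taken as a positive slope.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_deviate(delta, r_s, alpha):
    """First two terms of (50): Delta * R(S) - lambda_alpha."""
    lam = -_STD_NORMAL.inv_cdf(alpha)      # G(-lambda_alpha) = alpha
    return delta * r_s - lam


def asymptotic_power(delta, r_s, alpha):
    """Approximate power G[p(Delta)] from the linear power deviate."""
    return _STD_NORMAL.cdf(power_deviate(delta, r_s, alpha))


if __name__ == "__main__":
    for delta in (0.0, 0.1, 0.2, 0.3):
        print(delta, round(asymptotic_power(delta, r_s=10.0, alpha=0.05), 3))
```

At $\Delta = 0$ the approximation returns the significance level itself, and tests at different levels differ only in the intercept of the power deviate, which is the parallel-lines property noted above.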

Now consider the small sample theory with S possibly not normally distributed. Then if 
the power curves are plotted on probability paper they can be expected to form an approxi- 
mately parallel set of curves approaching a set of parallel lines as the sample size increases 
and the distribution of S tends to normality. This is of course only a method of presenting 
the results of power calculations, but we shall find it very convenient both in assessing the 
small-sample behaviour and in comparing different tests. 

Consider now two tests for which the asymptotic values of R(S) are R,, R,. Then asymp- 
totically in NV the power curves for a given « are two lines on probability paper, the ratio of 
their slopes being 2,/R, independent of «; both lines intersect the probability axis at a. 

A first consequence is that there is no simple general relation between the difference in
the power of the two tests and the ratio $R_1/R_2$. If $R_1 \neq R_2$ we can, by taking $\alpha$ sufficiently
small, make the difference in power between the two tests arbitrarily near unity. In practice
we are probably only interested in $0.20 > \alpha > 0.001$, but the general conclusion remains that
the difference in power between a quick test and an efficient test will be greatest for small $\alpha$.
Table 2 expresses this quantitatively; it shows for given $R_1/R_2$ the powers of the two tests
at the point at which the difference in powers is greatest. The values in Table 2 are
independent of $N$, but the values of $\Delta$ at which these powers are attained do depend on $N$.
This is the restriction on the alternative hypothesis referred to in §3. Thus if $R_1/R_2 = 0.7$
and $\alpha = 0.05$, the difference in power is greatest for the value of $\Delta$ at which the power of
the efficient test is 77% and of the quick test 51%.
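
The kind of calculation behind Table 2 can be sketched numerically rather than graphically. This is an illustrative grid search under the asymptotic linear form only; the function name and grid settings are assumptions, not part of the paper. Because the maximum is flat, the individual powers found this way can differ by a few points from the graphically read values in Table 2, while the maximum difference itself agrees closely.

```python
from statistics import NormalDist

G = NormalDist()

def max_power_gap(ratio, alpha, steps=20000, t_max=20.0):
    """Largest difference between the asymptotic power curves of an
    efficient test, G(t - lam), and a quick test, G(ratio * t - lam)."""
    lam = -G.inv_cdf(alpha)
    best = (0.0, alpha, alpha)  # (gap, quick power, efficient power)
    for i in range(steps + 1):
        t = t_max * i / steps   # t is the power deviate scale of the efficient test
        eff = G.cdf(t - lam)
        quick = G.cdf(ratio * t - lam)
        if eff - quick > best[0]:
            best = (eff - quick, quick, eff)
    return best

gap, quick, eff = max_power_gap(0.7, 0.05)
print(round(gap, 3), round(quick, 3), round(eff, 3))
```

For a ratio of 0.7 and a 5% level the gap comes out at about 26 percentage points, in line with the difference between the tabulated 77% and 51%.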

Now consider the power of $S_3$ in small samples. Two methods can be used. The first is
to take the expansion (50) to higher powers of $\Delta$ and to introduce a correction for the
non-normality of $S$ based on an Edgeworth expansion. This may be shown to give good results
even for very small $N$, and is a general method which could be used where direct numerical
calculation is difficult. However, for $S_3$ it is much easier to calculate the power directly from
the National Bureau of Standards tables of the binomial distribution (1950).


Table 2. Asymptotic theory. Powers (per cent) of quick and efficient tests
at points at which difference in power is greatest*

              Significance level α
R_1/R_2  |   0.10   |   0.05   |   0.01   |  0.001
---------+----------+----------+----------+---------
  0.9    |  67, 73  |  63, 71  |  49, 60  |  54, 67
  0.8    |  61, 74  |  56, 72  |  49, 71  |  43, 72
  0.7    |  59, 80  |  51, 77  |  42, 77  |  39, 83
  0.6    |  54, 84  |  47, 84  |  39, 86  |  29, 87
  0.5    |  48, 88  |  41, 89  |  30, 90  |  20, 93
  0.3    |  35, 96  |  27, 96  |  14, 97  |   7, 99

* These values were obtained graphically by drawing on probability paper lines whose ratio of
slopes is $R_1/R_2$ and reading off the maximum difference in probability between them. The differences
in power are determined accurately, but it is rather difficult to find the precise point of maximum
difference. The values in Table 2 involve $R_1$, $R_2$ only through their ratio $R_1/R_2$.

The power was computed in this way for $N = 15\,(15)\,135$, the significance level being the
largest value < 0.05. Under the null hypothesis the test statistic is distributed as $(\tfrac12 + \tfrac12)^{N/3}$
and under the alternative hypothesis as $(p + q)^{N/3}$, where

$$p = G\left(\frac{\sqrt{2}\,N\Delta}{3}\right). \tag{51}$$

The power corresponding to given $p$, $\tfrac13 N$ can be read off directly from the tables and (51)
solved for $\Delta$. The results are given in Table 3. For comparative purposes the exact power of
the parametric test based on the regression coefficient, $b$, has been computed for the same
values of $N$ and $\Delta$. When the standard deviation about the regression line is known, the
power is exactly $G[p(\Delta)]$, where

$$p(\Delta) = \{\tfrac{1}{12}N(N^2 - 1)\}^{1/2}\,\Delta - \lambda_\alpha. \tag{52}$$

To avoid rewriting the values of $\Delta$ in Table 4 the rows of both tables have been lettered, and
each entry in Table 4 relates to the value of $\Delta$ shown above the corresponding entry in
Table 3.

Asymptotically, the ratio of the $R$ values of the two tests is, by (34) and (37), $4/(3\sqrt{\pi}) = 0.75$;
the interpretation of this in terms of power can be obtained from Table 2. The full curve in
Fig. 1 and the full curves in Fig. 2 for $k = 0$ show the power curves for $N = 15, 30$, and the
dotted lines are the corresponding asymptotic power curves. The small-sample power is
lower than the value given by the asymptotic theory, the difference being quite appreciable
in the region of 80-90% power. The power curves of the most efficient test are exactly linear
and differ from their asymptotic form only because of the very small difference between
$\{N(N^2 - 1)\}^{1/2}$ and $N^{3/2}$. Hence the test $S_3$ is less efficient relative to $b$ than the asymptotic
theory suggests. The difference is greater for smaller $\alpha$. The corresponding graphs to Figs. 1
and 2 for higher $N$ show that for $\alpha$ in the range 0.01 to 0.05, the asymptotic theory applies
well for $N > 60$.

Table 3. Exact power of $S_3$ test against normal regression alternatives

Values of the standardized regression coefficient, Δ, are given in parentheses, and the corresponding
power appears immediately below.

Sample size (N)      |    15    |    30    |    45    |    60    |    75    |    90    |   105    |   120    |   135
Significance level α |  0.031   |  0.011   |  0.018   |  0.021   |  0.022   |  0.049   |  0.045   |  0.040   |  0.036
a  (Δ)               | (0.0035) | (0.0018) | (0.0012) | (0.0009) | (0.0007) | (0.0006) | (0.0005) | (0.0004) | (0.0004)
   power             |  0.035   |  0.013   |  0.021   |  0.026   |  0.027   |  0.062   |  0.057   |  0.053   |  0.048
b  (Δ)               | (0.0178) | (0.0089) | (0.0059) | (0.0044) | (0.0036) | (0.0030) | (0.0025) | (0.0022) | (0.0020)
   power             |  0.050   |  0.023   |  0.042   |  0.055   |  0.064   |  0.135   |  0.134   |  0.133   |  0.130
c  (Δ)               | (0.0358) | (0.0179) | (0.0119) | (0.0090) | (0.0072) | (0.0060) | (0.0051) | (0.0045) | (0.0040)
   power             |  0.078   |  0.046   |  0.091   |  0.126   |  0.154   |  0.291   |  0.306   |  0.317   |  0.327
d  (Δ)               | (0.0545) | (0.0272) | (0.0182) | (0.0136) | (0.0109) | (0.0091) | (0.0078) | (0.0068) | (0.0060)
   power             |  0.116   |  0.086   |  0.173   |  0.245   |  0.306   |  0.508   |  0.542   |  0.572   |  0.598
e  (Δ)               | (0.0742) | (0.0371) | (0.0247) | (0.0185) | (0.0148) | (0.0124) | (0.0106) | (0.0093) | (0.0082)
   power             |  0.168   |  0.149   |  0.297   |  0.416   |  0.512   |  0.730   |  0.773   |  0.807   |  0.836
f  (Δ)               | (0.0954) | (0.0477) | (0.0318) | (0.0238) | (0.0191) | (0.0159) | (0.0136) | (0.0119) | (0.0106)
   power             |  0.237   |  0.244   |  0.461   |  0.617   |  0.727   |  0.894   |  0.924   |  0.946   |  0.961
g  (Δ)               | (0.1190) | (0.0595) | (0.0397) | (0.0298) | (0.0238) | (0.0198) | (0.0170) | (0.0149) | (0.0132)
   power             |  0.328   |  0.376   |  0.648   |  0.804   |  0.891   |  0.974   |  0.986   |  0.992   |  0.996
h  (Δ)               | (0.1466) | (0.0733) | (0.0489) | (0.0366) | (0.0293) | (0.0244) | (0.0209) | (0.0183) | (0.0163)
   power             |  0.444   |  0.544   |  0.823   |  0.933   |  0.975   |  0.997   |  0.999   |  1.000   |  1.000
i  (Δ)               | (0.1812) | (0.0906) | (0.0604) | (0.0453) | (0.0362) | (0.0302) | (0.0259) | (0.0227) | (0.0201)
   power             |  0.590   |  0.736   |  0.944   |  0.989   |  0.998   |  1.000   |  1.000   |  1.000   |  1.000
j  (Δ)               | (0.2326) | (0.1163) | (0.0775) | (0.0582) | (0.0465) | (0.0388) | (0.0332) | (0.0291) | (0.0258)
   power             |  0.774   |  0.914   |  0.995   |  1.000   |  1.000   |  1.000   |  1.000   |  1.000   |  1.000

Table 4. Exact power of $b$ test against normal regression alternatives

Values of Δ are given in parentheses above the corresponding entry in Table 3.

Sample size (N)      |   15  |   30  |   45  |   60  |   75  |   90  |  105  |  120  |  135
Significance level α | 0.031 | 0.011 | 0.018 | 0.021 | 0.022 | 0.049 | 0.045 | 0.040 | 0.036
a                    | 0.036 | 0.013 | 0.023 | 0.027 | 0.030 | 0.066 | 0.062 | 0.057 | 0.054
b                    | 0.059 | 0.030 | 0.056 | 0.074 | 0.088 | 0.179 | 0.182 | 0.183 | 0.183
c                    | 0.103 | 0.074 | 0.143 | 0.201 | 0.249 | 0.429 | 0.457 | 0.481 | 0.502
d                    | 0.171 | 0.157 | 0.300 | 0.416 | 0.509 | 0.722 | 0.764 | 0.799 | 0.827
e                    | 0.267 | 0.294 | 0.519 | 0.673 | 0.775 | 0.919 | 0.945 | 0.962 | 0.973
f                    | 0.395 | 0.485 | 0.746 | 0.877 | 0.941 | 0.988 | 0.994 | 0.997 | 0.999
g                    | 0.551 | 0.699 | 0.911 | 0.975 | 0.993 | 0.999 | 1.000 | 1.000 | 1.000
h                    | 0.772 | 0.880 | 0.984 | 0.998 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
i                    | 0.879 | 0.977 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
j                    | 0.979 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
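
The binomial calculation behind Table 3 and the normal calculation behind Table 4 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it assumes unit standard deviation, a one-sided test rejecting for large scores, the reading $p = G(\sqrt{2}N\Delta/3)$ of (51), and the form (52) for the $b$ test; the function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

G = NormalDist()

def s3_level_and_power(N, delta, max_level=0.05):
    """Exact level and power of the unweighted one-third sign test."""
    n = N // 3                                  # number of comparisons
    level, c = 0.0, n + 1                       # reject when score >= c
    for s in range(n, -1, -1):                  # grow the upper tail while level <= max_level
        prob = comb(n, s) / 2 ** n
        if level + prob > max_level:
            break
        level, c = level + prob, s
    p = G.cdf(sqrt(2) * N * delta / 3)          # success probability under the alternative
    power = sum(comb(n, s) * p ** s * (1 - p) ** (n - s) for s in range(c, n + 1))
    return level, power

def b_power(N, delta, alpha):
    """Power of the regression-coefficient test with known standard deviation."""
    lam = -G.inv_cdf(alpha)
    return G.cdf(sqrt(N * (N * N - 1) / 12) * delta - lam)

level, power = s3_level_and_power(15, 0.0178)
print(round(level, 3), round(power, 3), round(b_power(15, 0.0178, level), 3))
```

For $N = 15$ and row b this gives a level of about 0.031 with powers close to the tabulated 0.050 and 0.059.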

The next thing is to investigate whether the form of $S_3$, involving the rejection of the
middle third of the set of observations, can profitably be modified in small samples. Suppose
that $(\tfrac13 N - 2k)$ observations are rejected so that the number of comparisons is $(\tfrac13 N + k)$;
the exact power function can be worked out from the binomial tables as before, but an
immediate comparison is not possible because the significance levels for different values of
$k$ cannot be made equal, except by the artificial device of randomized tests. However, if
the curves are plotted on probability paper they are almost parallel for different $\alpha$ and an
increase in power with change in $k$ would be shown by decreasing curvature. As would be
expected, a negative $k$ leads to a loss of power. Fig. 2 shows for $N = 30$ the curves for
$k = 0, 1, 2$. There is a tendency for the curvature to increase as $\alpha$ decreases, but there does
not seem to be any systematic change with $k$. Therefore, although an increase in $k$ increases
the number of available significance levels, it does not appreciably increase power. Hence
there seems to be little value in modifying the $\tfrac13$ rule in small samples; similar calculations
for $N = 15$ confirm this.
We have not made the corresponding investigations for $S_1$.

Fig. 1. Power of $S_3$ for $N = 15$. Full curve: exact power. Broken curve: value from asymptotic theory.

Fig. 2. Power of $S_3$ for $N = 30$. Full curves: exact power. Broken curve: value from asymptotic theory.
Full curves are in descending order $k = 0, 1, 2, 0, 1, 2$, the second set having lower values of
$\alpha$ than the first. Broken curve is $k = 0$.


10. EXAMPLES OF USE OF THE SIGN TESTS AGAINST TREND IN LOCATION
To illustrate the $S_1$ and $S_3$ tests, we use the figures of annual rainfall at Oxford for the years
1858-1952, quoted by Foster & Stuart (1954, Table 9).
For $S_1$ we compare the $k$th observation with the $(N - k + 1)$th, scoring 1 when the former
is the larger and 0 when it is the smaller. The unit scores are then weighted by the distance
apart of the observations compared, i.e. by $(N - 2k + 1)$. In this case $N = 95$ and is odd, so
that we must ignore the middle observation and proceed with $N = 94$. The unit scores are
those with weights as follows:


80. Sa. 79. 455 bie, thy OO, GR. Sis: see ke. Me By. hs, ee, ee 
The value of $S_1$ is the sum of these weights, 1011. From (16), with $N = 94$, we have
$E(S_1) = 1104.5$, $D^2(S_1) = 34603.75$, $D(S_1) = 186.0$.


The observed value of S, thus represents a deviation from expectation of almost exactly 
one-half its standard error and is therefore in good agreement with the null hypothesis of 
zero trend. 

If, alternatively, we were to use the simpler $S_3$ test we compare the $k$th observation with
the $(\tfrac23 N + k)$th, scoring 1 or 0 as before, but no weighting is necessary. Since $N = 95$ and is
not a multiple of three, we retain the extra observations (in accordance with our findings
at the end of §9 above) and compare each of the first thirty-two observations with the
corresponding observation in the last thirty-two. For our sequence of scores we obtain


1 0 1 0 1 0 0 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 1 0 0 0,


the total score $S_3$ being 14. This clearly agrees well with the expected value of $\tfrac12 \times 32 = 16$.
The standard error of $S_3$ is, from (33), $\sqrt{(\tfrac14 \times 32)} = 2.83$, so that the deviation from
expectation, corrected for continuity, is just over one-half of a standard error.
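
The arithmetic of both examples can be reproduced directly. The sketch below assumes only that, under the null hypothesis, each unit score is 1 or 0 with probability one-half independently; the null mean and variance of the weighted score are then built up from the weights $(N - 2k + 1)$ rather than quoted from (16). The function name is illustrative.

```python
from math import sqrt

def s1_null_moments(N):
    """Null mean and variance of the weighted sign score for even N."""
    weights = [N - 2 * k + 1 for k in range(1, N // 2 + 1)]
    mean = sum(weights) / 2               # each weight is scored with probability 1/2
    var = sum(w * w for w in weights) / 4
    return mean, var

# Weighted test: N = 94 after dropping the middle observation, observed score 1011.
mean1, var1 = s1_null_moments(94)
z1 = (1011 - mean1) / sqrt(var1)

# Unweighted test: 32 comparisons, observed score 14, continuity-corrected.
n3, s3 = 32, 14
mean3, sd3 = n3 / 2, sqrt(n3 / 4)
z3 = (abs(s3 - mean3) - 0.5) / sd3

print(mean1, var1, round(z1, 2), round(sd3, 2), round(z3, 2))
```

Both standardized deviations come out at about one-half, as stated in the text.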


11. SIGN TESTS FOR TREND IN DISPERSION 


We consider now tests for a trend not in location but in the dispersion about a fixed location.
For example, in a regression problem we may want to test quickly whether the scatter
about the regression curve increases as the independent variable increases.

Divide the series $x_1, \ldots, x_N$ into sets $x_1, \ldots, x_k$; $x_{k+1}, \ldots, x_{2k}$; ..., rejecting a few observations
in the centre of the original series if $N$ is not exactly divisible by $k$. The best choice of $k$ is
discussed below. For each set of $k$ observations find the range, $w$, thus getting a series of
ranges $w_1, \ldots, w_r$, where $r$ is the integral part of $N/k$. The ranges are then tested for trend
by one or other of the tests $S_1$ and $S_3$.

If the null hypothesis is that $x_1, \ldots, x_N$ are independently distributed with constant
dispersion about a regression line, $w_1, \ldots, w_r$ are independent and identically distributed.
If the regression is not linear the $w$'s will be approximately identically distributed unless
the trend within sets of observations varies appreciably.

In the next section, a valid test is obtained for any k, and the best value of k for detecting 
certain special forms of trend is found for large samples. The behaviour for small samples 
has not been investigated. The following provisional rules are suggested: 


if $N \ge 90$ take $k = 5$,
if $90 > N \ge 64$ take $k = 4$,
if $64 > N \ge 48$ take $k = 3$,
if $48 > N$ take $k = 2$.


Except when $N$ is very large it is probably advisable to use the weighted, rather than the
unweighted, sign test.
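
The procedure of this section can be sketched as two small helpers. The names `choose_k` and `set_ranges` are hypothetical, and the exact choice of which central observations to reject when $N$ is not divisible by $k$ is an assumption (the text says only "a few observations in the centre").

```python
def choose_k(N):
    """Provisional rule for the set size k."""
    if N >= 90:
        return 5
    if N >= 64:
        return 4
    if N >= 48:
        return 3
    return 2

def set_ranges(x, k):
    """Ranges of consecutive sets of k observations.

    Any surplus observations are removed from the centre of the series
    (assumed convention) so that exactly r = N // k full sets remain.
    """
    x = list(x)
    N = len(x)
    r = N // k
    surplus = N - r * k
    start = (N - surplus) // 2            # first central observation to reject
    kept = x[:start] + x[start + surplus:]
    sets = [kept[i * k:(i + 1) * k] for i in range(r)]
    return [max(s) - min(s) for s in sets]

print(choose_k(95), set_ranges(range(10), 3))
```

The resulting series of ranges is then passed to whichever sign test for trend is being used.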

12. CHOICE OF k FOR DISPERSION TESTS


To investigate the theory of the test for trend in dispersion we take a special form for the
null and alternative hypotheses. Suppose that $x_1, \ldots, x_N$ are independently normally
distributed with constant mean and with standard deviations $\sigma(1), \ldots, \sigma(N)$, where $\sigma(n)$ varies
at most slowly with $n$. Then the ranges $w_1, \ldots, w_r$ defined in §11 have very nearly the
distribution of ranges of $k$ observations drawn from normal populations of standard deviations
$\sigma_1 = \sigma(\tfrac12 k)$, $\sigma_2 = \sigma(\tfrac32 k)$, .... Patnaik (1950) has shown that a range of $k$ observations can be
represented to a close approximation as a multiple of a $\chi$-variate with suitably chosen
degrees of freedom, $\nu_k$. Therefore $w_i^2\sigma_j^2/(w_j^2\sigma_i^2)$ is approximately an $F$ variate with $(\nu_k, \nu_k)$
degrees of freedom.
Hence
$$\operatorname{prob}(w_i/w_j > 1) = \int_{\sigma_j^2/\sigma_i^2}^{\infty} \frac{\Gamma(\nu_k)}{\{\Gamma(\tfrac12\nu_k)\}^2}\, x^{\frac12\nu_k - 1}(1 + x)^{-\nu_k}\,dx \simeq \tfrac12 + A_k\left\{\left(\frac{\sigma_i}{\sigma_j}\right)^2 - 1\right\},$$
where
$$A_k = \frac{\Gamma(\nu_k)}{2^{\nu_k}\{\Gamma(\tfrac12\nu_k)\}^2},$$
provided that $(\sigma_i/\sigma_j)^2 - 1$ is small.
If we assume that the trend in standard deviation is such that $\sigma(n) = \sigma_0 e^{\gamma n} \simeq \sigma_0(1 + \gamma n)$,
where $\gamma N \ll 1$, so that $\gamma$ is the fractional increase in standard deviation per observation,
we have
$$(\sigma_i/\sigma_j)^2 - 1 \simeq 2k\gamma(i - j)$$
and
$$\operatorname{prob}(w_i/w_j > 1) \simeq \tfrac12 + 2k\gamma(i - j)A_k.$$
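
The constant $A_k$ and the linear approximation can be checked in the one case where no table is needed. For $k = 2$ the range is $|x_1 - x_2|$, exactly a multiple of a $\chi$-variate with one degree of freedom, so the sketch below takes $\nu = 1$; the values of $\nu_k$ for larger $k$ come from Patnaik's table and are not reproduced here. The closed form used for the exact probability relies on an $F$ variate with (1, 1) degrees of freedom being the square of a Cauchy variate.

```python
from math import atan, gamma, pi, sqrt

def A(nu):
    """A = Gamma(nu) / (2**nu * Gamma(nu/2)**2): the F(nu, nu) density at 1."""
    return gamma(nu) / (2 ** nu * gamma(nu / 2) ** 2)

def prob_exact_nu1(ratio_sq):
    """P(w_i / w_j > 1) for nu = 1, where ratio_sq = (sigma_i / sigma_j)**2."""
    return 1 - (2 / pi) * atan(1 / sqrt(ratio_sq))

def prob_linear(ratio_sq, nu):
    """Linear approximation 1/2 + A * {(sigma_i / sigma_j)**2 - 1}."""
    return 0.5 + A(nu) * (ratio_sq - 1)

print(round(A(1) / sqrt(2), 3))                      # A_k / sqrt(k) for k = 2
print(prob_exact_nu1(1.02), prob_linear(1.02, 1))    # nearly equal for a small trend
```

The first printed value agrees with the entry for $k = 2$ in Table 5.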


Consider first the application to the ranges of the unweighted sign test, $S_3$. From the
$r = N/k$ ranges we make approximately $\tfrac13 N/k$ independent comparisons in each of which
$i - j \simeq \tfrac23 r$. Therefore if $S$ is the total score, its mean and standard deviation are given by

$$E(S \mid \gamma) = \frac{N}{6k} + \frac{4N^2 A_k \gamma}{9k},$$

$$D(S \mid \gamma) = \left(\frac{N}{12k}\right)^{1/2} + O(\gamma).$$


Therefore the power derivative, $p_3'(0)$, of the test is
$$p_3'(0) = \frac{E'(S \mid 0)}{D(S \mid 0)} = \frac{8\sqrt{3}\,N^{3/2}A_k}{9\sqrt{k}}. \tag{53}$$
An exactly analogous calculation for the weighted sign test $S_1$ gives
$$p_1'(0) \simeq \frac{2\sqrt{6}\,N^{3/2}A_k}{3\sqrt{k}}. \tag{54}$$

Thus in both cases the asymptotically best value of $k$ is the one that maximizes $A_k/\sqrt{k}$.
From Patnaik's table of $\nu_k$ the values in Table 5 have been computed.

Now the number of ranges is $N/k$ and the number of comparisons is one-half or one-third
of this and is therefore small even when $N$ is, by usual standards, quite large. Therefore it
is advisable to use smaller values of $k$ than the theoretical optimum in large samples. In
the absence of an investigation of the small-sample properties of the test, the rule of §11
is suggested. This is based on the considerations that there is little gain in taking $k > 5$
and that it is advisable, whenever possible, to have at least sixteen ranges.

Table 5. Determination of efficiencies of different set sizes, k, for testing trend in dispersion

 k  |  A_k/√k  |  k  |  A_k/√k
----+----------+-----+---------
 2  |  0.112   |  6  |  0.167
 3  |  0.141   |  7  |  0.169
 4  |  0.158   |  8  |  0.170
 5  |  0.164   |  9  |  0.169

If we substitute, in (53) and (54), the value $A_k/\sqrt{k} \simeq 0.16$, we have

$$p_3'(0) \simeq 0.246\,N^{3/2}, \qquad p_1'(0) \simeq 0.261\,N^{3/2}. \tag{55}$$


It remains to compare (55) with the corresponding quantity for the maximum-likelihood 
test of the corresponding parametric hypotheses. 


13. A.R.E. OF DISPERSION TESTS 


For simplicity assume that $x_1, \ldots, x_N$ are normally and independently distributed with zero
mean and that the standard deviation of $x_n$ is $\sigma_0 e^{\gamma n}$, where $\gamma N$ is small. The log likelihood is

$$L = -\tfrac12 N \log(2\pi) - N \log \sigma_0 - \gamma \sum_{n=1}^{N} n - \frac{1}{2\sigma_0^2} \sum_{n=1}^{N} x_n^2 e^{-2\gamma n}.$$

If we differentiate and take expectations, retaining only the terms independent of $\gamma$, and
letting $N$ tend to infinity, we get
$$E\left(\frac{\partial^2 L}{\partial \sigma_0^2}\right) \sim -\frac{2N}{\sigma_0^2}, \qquad E\left(\frac{\partial^2 L}{\partial \sigma_0\,\partial \gamma}\right) \sim -\frac{N^2}{\sigma_0}, \qquad E\left(\frac{\partial^2 L}{\partial \gamma^2}\right) \sim -\tfrac23 N^3. \tag{56}$$
The large-sample variance of $\hat\gamma$, the maximum-likelihood estimate of $\gamma$, is given by inverting
the Hessian matrix with elements (56). We get when $\gamma$ is small and $N$ is large

$$V(\hat\gamma) \sim 6/N^3. \tag{57}$$

Thus the power derivative of the test based on the maximum-likelihood estimate is
$$p_{ML}'(0) = \frac{N^{3/2}}{\sqrt{6}} = 0.408\,N^{3/2}. \tag{58}$$
(58) still applies if the mean is unknown or if a linear trend in mean has to be estimated.

From the formulae (55) and (58), and the fact that $p'(0) = R(S)$, it follows, on using (2)
with $r = \tfrac32$, that the A.R.E.'s of the tests $S_3$, $S_1$ compared with the maximum-likelihood
test are about 71 and 74% respectively.
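
The two numerical steps of this section can be retraced as follows. The sketch inverts the 2 x 2 information matrix built from the leading terms in (56), using exact fractions, and then forms the efficiencies. It assumes that formula (2), which is not reproduced in this part of the paper, has the form $(R_1/R_2)^{1/r}$; the constants 0.246, 0.261 and 0.408 are those of (55) and (58).

```python
from fractions import Fraction

def var_gamma_hat(N, sigma0=1):
    """Large-sample variance of the ML estimate of gamma from the terms in (56)."""
    a = Fraction(2 * N) / sigma0 ** 2      # -E d2L/d(sigma0)^2
    b = Fraction(N ** 2) / sigma0          # -E d2L/d(sigma0)d(gamma)
    c = Fraction(2 * N ** 3, 3)            # -E d2L/d(gamma)^2
    return a / (a * c - b * b)             # (gamma, gamma) element of the inverse

def are(p_quick, p_best, r):
    """Asymptotic relative efficiency, assuming (2) reads (R_1 / R_2) ** (1 / r)."""
    return (p_quick / p_best) ** (1 / r)

print(var_gamma_hat(100))                  # 6 / N**3
print(round(100 * are(0.246, 0.408, 1.5)), round(100 * are(0.261, 0.408, 1.5)))
```

The variance does not depend on the value taken for `sigma0`, which is why it is set to 1 by default.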

A test entirely analogous to the above tests can be found by calculating the variances 
within each set of k instead of the range. This is slightly more efficient in the parametric 
case but much of the simplicity of the test is lost and the increase in power may be shown 
to be trivial. 

14. EXAMPLES OF THE USE OF SIGN TESTS AGAINST TREND IN DISPERSION


We again use for illustrative purposes the rainfall data quoted by Foster & Stuart (1954,
Table 9). Using the provisional rule given in §11 above, we take ranges of sets of five
observations. Since $N = 95$, this gives us exactly nineteen sets, no rejection of observations
being necessary. The nineteen ranges are:


9.64, 12.30, 12.01, 11.45, 5.43, 13.05, 9.86, 10.89, 6.95, 15.03,
11.34, 6.63, 12.19, 8.55, 4.80, 11.00, 7.76, 7.03, 10.98.


If we apply the test $S_1$ we drop the middle value and take the signs of 10.98 − 9.64,
7.03 − 12.30, ... down to 11.34 − 6.95, thus obtaining


score:    0   1   1   1   1   1   0   1   0
weight:  17  15  13  11   9   7   5   3   1


The total score is therefore 58, and from (16) with $N = 18$ the mean score is 40.5 and the
variance is 242.25, so that the standard error is 15.6. The deviation from expectation is
about 1.12 standard errors, and so the two-sided normal significance level is about 27%.
The exact significance level is, by enumeration, 73/256 ≈ 28½%.

If we use the test $S_3$ we reject the middle five of the nineteen ranges and take the signs of
12.19 − 9.64, etc. This gives scores


0   1   1   1   0   1   0


There is clearly good agreement with an equal probability for zeros and ones; significance
would be tested in the binomial distribution $(\tfrac12 + \tfrac12)^7$. The test $S_3$ is not to be recommended
in the present instance because with only seven comparisons the loss of sensitivity compared
to the $S_1$ test would be considerable.

Thus although there is a slight indication that the dispersion decreases with time, both 
tests suggest that this could easily be a sampling fluctuation. 
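
Both sets of scores can be regenerated from the nineteen ranges listed above. The sketch follows the scoring convention of §10 (score 1 when the earlier value is the larger); the helper names are illustrative, and the number of comparisons for the unweighted test is taken as the integer ceiling of one-third of the number of ranges, which reproduces the "middle five rejected" of the text.

```python
from math import sqrt

ranges = [9.64, 12.30, 12.01, 11.45, 5.43, 13.05, 9.86, 10.89, 6.95, 15.03,
          11.34, 6.63, 12.19, 8.55, 4.80, 11.00, 7.76, 7.03, 10.98]

def weighted_pairs(w):
    """(score, weight) pairs for the weighted test; middle value dropped if odd."""
    w = list(w)
    if len(w) % 2:
        mid = len(w) // 2
        w = w[:mid] + w[mid + 1:]
    n = len(w)
    return [(1 if w[k - 1] > w[n - k] else 0, n - 2 * k + 1)
            for k in range(1, n // 2 + 1)]

def unweighted_scores(w):
    """Scores for the unweighted test comparing the first and last thirds."""
    r = len(w)
    m = -(-r // 3)                         # ceiling of r / 3 comparisons
    return [1 if w[k] > w[r - m + k] else 0 for k in range(m)]

pairs = weighted_pairs(ranges)
S1 = sum(score * weight for score, weight in pairs)
mean = sum(weight for _, weight in pairs) / 2
var = sum(weight * weight for _, weight in pairs) / 4
z = (S1 - mean) / sqrt(var)
print(S1, mean, var, round(z, 2), unweighted_scores(ranges))
```

The output reproduces the total score of 58, the null mean 40.5 and variance 242.25, and the seven unweighted scores.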


15. SEQUENTIAL TESTS 


Finally, we point out the possibility of constructing sequential tests for trend related to the 
tests considered above. While this paper was in preparation an abstract (Noether, 1954) 
appeared describing briefly a test rather similar to the one we had developed. Hence a full 
discussion will not be attempted here. However, some calculations in a special case suggest
that the average sample sizes under the null and alternative hypotheses are, for the sequential
sign tests, only a little greater than the corresponding parametric fixed sample size.

Sequential sign tests for trend are only likely to be of practical value under rather 
exceptional circumstances. For they require that observations are sufficiently easy to 
obtain for it to be worth while to use inefficient methods of analysis, and yet sufficiently 
difficult to obtain for the saving from the use of a sequential method to be important. 
A possible application is to the marking of a large number of examination scripts. If they 
are marked in alphabetical order it may be useful to test, as the marking proceeds, for a 
trend in the marks, which would indicate a changing standard of marking. A sequential 
method is appropriate and yet elaborate calculations would be out of place. 

16. GENERAL COMMENTS


The calculation of the efficiency of the above tests and the determination of optimum 
weightings, etc., has been based on a particular type of alternative hypothesis. It is clear 
in a general way that the tests will remain effective for detecting monotone trends. Positive 
serial correlation among the observations would increase the chance of a significant answer 
even in the absence of a trend. 


The occurrence of ties has been ignored in the above work. A small number of ties can be
dealt with by counting one-half a comparison in each direction, i.e. if $y_i = y_j$ we calculate
as if one-half a comparison has $y_i > y_j$ and one-half has $y_i < y_j$. If a substantial proportion
of the comparisons are ties a special investigation is necessary or the comparisons should
be randomized.


Estimates for the trend could be constructed from the test statistics $S_1$, $S_3$. It is very
doubtful if such estimates would be of value; in any case, in much work with quick tests,
if the trend is shown to be significant it can be estimated graphically.

THE VARIANCE OF THE MAXIMUM OF PARTIAL SUMS OF A
FINITE NUMBER OF INDEPENDENT NORMAL VARIATES
By A. A. ANIS


Cairo University


1. THE PROBLEM
Consider $n$ independent standard normal variates $X_1, X_2, \ldots, X_n$ and their partial sums
$$S_r = X_1 + \ldots + X_r \quad (r = 1, 2, \ldots, n).$$
Let
$$U_n = \max_r \{S_r\}$$
denote the maximum of these partial sums.
In a paper by Anis & Lloyd (1953) the expectation of $U_n$ was studied and it was shown that

$$\mathscr{E}(U_n) = (2\pi)^{-1/2} \sum_{r=1}^{n-1} r^{-1/2}.$$

In the present paper the second moment about zero and hence the variance of $U_n$ is obtained
(see equation (7.1)).

We shall always use the symbol $\phi(x)$ to denote the probability density function of the
standard normal variate, i.e. $\phi(x) = (2\pi)^{-1/2}\exp(-\tfrac12 x^2)$,
and $F_n(x)$, $f_n(x)$ to denote respectively the distribution function and the probability density
function of $U_n$:
$$F_n(x) = \Pr(U_n < x), \qquad f_n(x) = \frac{d}{dx}F_n(x).$$
We have
$$F_n(y) = \int \cdots \int_K \prod_{i=1}^{n} \phi(x_i)\,dx_i \quad (-\infty < y < \infty), \tag{1.1}$$
where the region of integration $K$ is defined by
$$K:\ \sum_{i=1}^{r} x_i < y \quad (r = 1, 2, \ldots, n).$$
It may be deduced that
$$F_n(x) = \int_0^\infty F_{n-1}(t)\,\phi(x - t)\,dt, \tag{1.2}$$
and that
$$f_n(x) = F_{n-1}(0)\,\phi(x) + \int_0^\infty f_{n-1}(t)\,\phi(x - t)\,dt. \tag{1.3}$$

2. THREE LEMMAS ON THE $F_r(0)$


At this stage we state three results relating to the $F_r(0)$ which we shall need in the sequel.
LEMMA 1.
$$\sum_{r=0}^{n} F_r(0)\,F_{n-r}(0) = 1. \tag{2.1}$$

This was proved in Anis & Lloyd (1953), and is repeated here merely for completeness.

LEMMA 2.
$$F_r(0) = \frac{(2r)!}{2^{2r}(r!)^2}. \tag{2.2}$$

To prove this we define a generating function

$$M(\lambda) = \sum_{i=0}^{\infty} \lambda^i F_i(0).$$

Then, using Lemma 1, it is readily seen that
$$M^2(\lambda) = (1 - \lambda)^{-1}. \tag{2.3}$$
Picking out the appropriate coefficient from $M(\lambda) = (1 - \lambda)^{-1/2}$ gives the required result.


LEMMA 3.
$$\sum_{r=0}^{n} r\,F_r(0)\,F_{n-r}(0) = \tfrac12 n.$$
This follows from Lemma 2 on differentiating (2.3) and equating coefficients of $\lambda^{n-1}$.
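
The three lemmas lend themselves to an exact numerical check. The sketch below takes the closed form of Lemma 2 as the definition of $F_r(0)$ (so that $F_0(0) = 1$) and verifies the sums of Lemmas 1 and 3 in rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def F0(r):
    """F_r(0) from Lemma 2: (2r)! / (2**(2r) * (r!)**2)."""
    return Fraction(factorial(2 * r), 4 ** r * factorial(r) ** 2)

def lemma1_sum(n):
    return sum(F0(r) * F0(n - r) for r in range(n + 1))

def lemma3_sum(n):
    return sum(r * F0(r) * F0(n - r) for r in range(n + 1))

for n in range(6):
    print(n, lemma1_sum(n), lemma3_sum(n))   # second column 1, third column n/2
```

Using `Fraction` avoids any rounding, so the identities hold exactly for every `n` tried.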


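3. THE SECOND MOMENT OF $U_n$ AS A LINEAR COMPOUND OF THE $F_r(0)$


The second moment $\mu_2(n)$ of $U_n$ is given by

$$\mu_2(n) = \int_{-\infty}^{\infty} x^2 f_n(x)\,dx,$$
and, using the reduction formula (1.3), this becomes
$$\mu_2(n) = F_{n-1}(0) + \int_{-\infty}^{\infty}\int_0^{\infty} x^2\,\phi(x - t)\,f_{n-1}(t)\,dt\,dx.$$

The double integral can be integrated once, with respect to $x$. Using well-known
properties of $\phi(x)$, and remembering that $F_{n-1}$ is the integral function of $f_{n-1}$, we obtain

$$\mu_2(n) = 1 + \int_0^{\infty} t^2 f_{n-1}(t)\,dt. \tag{3.1}$$

We now use the reduction formula (1.3) a second time, resulting in

$$\mu_2(n) = 1 + F_{n-2}(0)\int_0^{\infty} t^2\phi(t)\,dt + \int_0^{\infty}\int_0^{\infty} t^2\,\phi(t - u)\,f_{n-2}(u)\,du\,dt.$$

The last integral may be reduced in the same way. Continuing, we find
$$\mu_2(n) = 1 + \sum_{r=1}^{n-1} g_r\,F_{n-r-1}(0), \tag{3.2}$$

where
$$g_r = \int \cdots \int_0^{\infty} y_1^2\,\phi(y_1 - y_2)\,\phi(y_2 - y_3) \cdots \phi(y_{r-1} - y_r)\,\phi(y_r) \prod_{i=1}^{r} dy_i. \tag{3.3}$$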


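4. THE COEFFICIENTS $g_r$ OF THE LINEAR COMPOUND
We now seek to evaluate the $g_r$. As a first step we make the transformation
$$x_i = y_i - y_{i+1} \quad (i = 1, 2, \ldots, r;\ y_{r+1} = 0), \qquad\text{or}\qquad y_i = \sum_{j=i}^{r} x_j \quad (i = 1, 2, \ldots, r). \tag{4.1}$$

We then have
$$g_r = \int \cdots \int_R (x_1 + x_2 + \ldots + x_r)^2 \prod_{i=1}^{r} \phi(x_i)\,dx_i, \tag{4.2}$$

where the region of integration is

$$R:\ \sum_{j=i}^{r} x_j > 0 \quad (i = 1, 2, \ldots, r).$$
Hence $g_r = L_r + H_r$,
where
$$L_r = \int \cdots \int_R \Big(\sum_{i=1}^{r} x_i^2\Big) \prod_{i=1}^{r} \phi(x_i)\,dx_i, \qquad H_r = \int \cdots \int_R \Big(\sum_{i \neq j} x_i x_j\Big) \prod_{i=1}^{r} \phi(x_i)\,dx_i. \tag{4.3}$$
We now consider these two integrals separately.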


5. EVALUATION OF $L_r$


The integral $L_r$ of (4.3) is readily evaluated, as follows. Since the integrand is spherically
symmetrical the value of the integral is proportional to the magnitude of the $r$-dimensional
solid angle defined by the region of integration $R$; and this in turn is proportional to the
$F_r(0)$, since by (1.1) we have

$$F_r(0) = \int \cdots \int_R \prod_{i=1}^{r} \phi(x_i)\,dx_i. \tag{5.1}$$

Let us write $F_r^0(0)$ to denote the integral of the integrand of $F_r(0)$ taken over the whole
$r$-space, and let $L_r^0$ be similarly related to $L_r$; then the proportionality of our functions $F_r(0)$
and $L_r$ to the solid angle $R$ enables us to write

$$F_r(0) : F_r^0(0) = L_r : L_r^0,$$
whence
$$L_r = L_r^0\,F_r(0), \tag{5.2}$$
since $F_r^0(0) = 1$.


The value of $L_r^0$ is easily obtained by considering the standard integral
$$(2\pi)^{-r/2} \int \cdots \int_{-\infty}^{\infty} \exp\Big(-\tfrac12 k \sum_{i=1}^{r} x_i^2\Big) \prod_{i=1}^{r} dx_i = k^{-r/2}.$$
If we differentiate this expression with respect to $k$ and then put $k = 1$ we find
$$L_r^0 = r.$$
Hence, from (5.2),
$$L_r = r\,F_r(0). \tag{5.3}$$

The $F_r(0)$ are, of course, known explicitly.


6. EVALUATION OF $H_r$


We now consider the integral $H_r$ of (4.3). We first transform back to the original variables
$y_i$ by the transformation (4.1). In this process the cross-product term transforms as follows:

$$\sum_{i \neq j} x_i x_j = 2\sum_{i=1}^{r-1}\Big\{x_i \sum_{j=i+1}^{r} x_j\Big\} = 2\sum_{i=1}^{r-1} (y_i - y_{i+1})\,y_{i+1} = 2\sum_{s=1}^{r-1}\Big\{(2y_s - y_{s-1} - y_{s+1}) \sum_{i=s+1}^{r} y_i\Big\},$$

provided we interpret $y_0$ and $y_{r+1}$ conventionally as

$$y_0 = y_1, \qquad y_{r+1} = 0. \tag{6.1}$$
Expressing $H_r$ in terms of the $y$'s, and retaining these conventions, we then have

$$H_r = 2\sum_{s=1}^{r-1} K_s(r), \tag{6.2}$$
where
$$K_s(r) = \int \cdots \int_0^{\infty} (2y_s - y_{s-1} - y_{s+1}) \Big(\sum_{i=s+1}^{r} y_i\Big) \prod_{i=1}^{r} \phi(y_i - y_{i+1})\,dy_i. \tag{6.3}$$

The reason for writing $K_s(r)$ in this form is that it enables us to perform one of the
integrations at once, since

$$\int_0^{\infty} (2y_s - y_{s-1} - y_{s+1})\,\phi(y_{s-1} - y_s)\,\phi(y_s - y_{s+1})\,dy_s = \phi(y_{s-1})\,\phi(y_{s+1}), \tag{6.4}$$

as is immediately seen on using the explicit form of the functions $\phi$.
Equation (6.3) then becomes

$$K_s(r) = \int \cdots \int_0^{\infty} \prod_{i=1}^{s-2} \phi(y_i - y_{i+1})\,\phi(y_{s-1})\,dy_1 \ldots dy_{s-1} \times \int \cdots \int_0^{\infty} \Big(\sum_{i=s+1}^{r} y_i\Big)\,\phi(y_{s+1}) \prod_{i=s+1}^{r} \phi(y_i - y_{i+1})\,dy_{s+1} \ldots dy_r. \tag{6.5}$$

Now in this expression the $(s - 1)$-fold integral is equal to $F_{s-1}(0)$, as may be seen on applying
the transformation (4.1) to (5.1).


The other factor in the last expression, an $(r - s)$-fold integral, we call $G_{r-s}$. Thus

$$G_k = \int \cdots \int_0^{\infty} \Big(\sum_{i=1}^{k} y_i\Big)\,\phi(y_1)\,\phi(y_1 - y_2) \cdots \phi(y_{k-1} - y_k)\,\phi(y_k)\,dy_1 \ldots dy_k. \tag{6.6}$$

We now proceed to show that this is expressible in terms of an integral previously
evaluated by Anis & Lloyd (1953):

$$E_k = \int \cdots \int_0^{\infty} \phi(y_1)\,\phi(y_1 - y_2) \cdots \phi(y_{k-1} - y_k)\,\phi(y_k)\,dy_1 \ldots dy_k = (2\pi)^{-1/2}(k + 1)^{-3/2}. \tag{6.7}$$
We use the identity
$$\sum_{r=1}^{k} y_r = \sum_{r=1}^{k} \tfrac12 r(k + 1 - r)(2y_r - y_{r-1} - y_{r+1}), \tag{6.8}$$
where, by convention, we take $y_0 = y_{k+1} = 0$.


Equation (6.6) becomes

$$G_k = \sum_{r=1}^{k} \tfrac12 r(k - r + 1) \int \cdots \int_0^{\infty} (2y_r - y_{r-1} - y_{r+1})\,\phi(y_1) \prod_{i=1}^{k-1} \phi(y_i - y_{i+1})\,\phi(y_k)\,dy_1 \ldots dy_k. \tag{6.9}$$
We now use (6.4) to carry out one of the integrations, thus reducing the $k$-fold integral to

$$\int \cdots \int_0^{\infty} \phi(y_1)\,\phi(y_1 - y_2) \cdots \phi(y_{r-2} - y_{r-1})\,\phi(y_{r-1})\,\phi(y_{r+1})\,\phi(y_{r+1} - y_{r+2}) \cdots \phi(y_{k-1} - y_k)\,\phi(y_k)\,dy_1 \ldots dy_{r-1}\,dy_{r+1} \ldots dy_k = E_{r-1}E_{k-r}. \tag{6.10}$$

Gathering the results of this section together, we have from (6.2), (6.5), (6.9) and (6.10),

$$H_r = 2\sum_{s=1}^{r-1} K_s(r), \quad K_s(r) = F_{s-1}(0)\,G_{r-s}, \quad G_k = \sum_{r=1}^{k} \tfrac12 r(k + 1 - r)E_{r-1}E_{k-r}, \quad E_r = (2\pi)^{-1/2}(r + 1)^{-3/2}. \tag{6.11}$$
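
Two of the algebraic steps in this section are easy to confirm numerically. The sketch below, with illustrative function names, checks the identity (6.8) on arbitrary values and confirms that the expression for $G_k$ in (6.11) collapses to $(4\pi)^{-1}\sum_j \{j(k - j + 1)\}^{-1/2}$ once the closed form of $E_r$ is substituted.

```python
import random
from math import isclose, pi

def rhs_68(y):
    """Right-hand side of identity (6.8), with y_0 = y_{k+1} = 0."""
    k = len(y)
    ypad = [0.0] + list(y) + [0.0]
    return sum(0.5 * r * (k + 1 - r) * (2 * ypad[r] - ypad[r - 1] - ypad[r + 1])
               for r in range(1, k + 1))

def E(r):
    """E_r = (2*pi)**(-1/2) * (r + 1)**(-3/2), as in (6.7)."""
    return (2 * pi) ** -0.5 * (r + 1) ** -1.5

def G_from_E(k):
    """G_k as the sum over products E_{r-1} E_{k-r} in (6.11)."""
    return sum(0.5 * r * (k + 1 - r) * E(r - 1) * E(k - r) for r in range(1, k + 1))

def G_closed(k):
    """G_k after substituting E_r: (4*pi)**(-1) * sum {j (k - j + 1)}**(-1/2)."""
    return sum((j * (k - j + 1)) ** -0.5 for j in range(1, k + 1)) / (4 * pi)

random.seed(0)
y = [random.random() for _ in range(7)]
print(isclose(rhs_68(y), sum(y)), isclose(G_from_E(5), G_closed(5)))
```

The second form of $G_k$ is the one that leads directly to the double sum in the final result.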

7. CONCLUSION


We have found (3.2) that the second moment of the maximum of the partial sums
$X_1, X_1 + X_2, \ldots, X_1 + \ldots + X_n$ is

$$\mu_2(n) = 1 + \sum_{r=1}^{n-1} g_r\,F_{n-r-1}(0),$$

where
$$g_r = r\,F_r(0) + 2\sum_{s=1}^{r-1} F_{s-1}(0)\,G_{r-s}.$$
Thus
$$\mu_2(n) = 1 + \sum_{r=1}^{n-1} r\,F_r(0)\,F_{n-r-1}(0) + 2\sum_{r=1}^{n-1}\sum_{s=1}^{r-1} F_{n-r-1}(0)\,F_{s-1}(0)\,G_{r-s}.$$

The second term on the right-hand side equals $\tfrac12(n - 1)$, by Lemma 3, and the third term
reduces, with the aid of Lemma 1, to $2\sum_{r=1}^{n-2} G_r$. So we finally obtain

$$\mu_2(n) = \tfrac12(n + 1) + 2\sum_{r=1}^{n-2} G_r,$$

where, by (6.11),
$$G_r = (4\pi)^{-1} \sum_{j=1}^{r} \{j(r - j + 1)\}^{-1/2}.$$

Finally, then
$$\mu_2(n) = \tfrac12(n + 1) + (2\pi)^{-1} \sum_{r=1}^{n-2}\sum_{s=1}^{r} \{s(r - s + 1)\}^{-1/2}. \tag{7.1}$$

From the point of view of numerical evaluation it is fortunate that the individual terms
of the summand in (7.1) are independent of $n$; computations carried out for a given value
of $n$ can be immediately utilized for larger values of $n$. A short table of specimen values is
appended.

Short table of first and second moments about zero, and standard deviation, of the maximum
of the partial sums of n independent standard normal deviates

  n  |   μ_1   |   μ_2    |    σ
-----+---------+----------+---------
  3  | 0.6810  |  2.1592  | 1.3021
  4  | 0.9114  |  2.8842  | 1.4311
  5  | 1.1108  |  3.6476  | 1.5536
  6  | 1.2893  |  4.4367  | 1.6657
  7  | 1.4521  |  5.2446  | 1.7709
  8  | 1.6029  |  6.0671  | 1.8702
  9  | 1.7440  |  6.9013  | 1.9647
 10  | 1.8769  |  7.7451  | 2.0548
 11  | 2.0031  |  8.5971  | 2.1412
 12  | 2.1234  |  9.4560  | 2.2242
 13  | 2.2385  | 10.3210  | 2.3043
 14  | 2.3492  | 11.1914  | 2.3817
 15  | 2.4558  | 12.0665  | 2.4567
 16  | 2.5588  | 12.9459  | 2.5295
 17  | 2.6585  | 13.8292  | 2.6002
 18  | 2.7553  | 14.7160  | 2.6691
 19  | 2.8493  | 15.6060  | 2.7363
 20  | 2.9409  | 16.4989  | 2.8018
 21  | 3.0301  | 17.3946  | 2.8659
 22  | 3.1171  | 18.2928  | 2.9285
 23  | 3.2022  | 19.1934  | 2.9899
 24  | 3.2854  | 20.0962  | 3.0500
 25  | 3.3668  | 21.0010  | 3.1090
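
The tabulated values can be regenerated from the two closed forms. The following sketch evaluates the expectation formula of Anis & Lloyd quoted in §1 and the second moment as the sum of $\tfrac12(n + 1)$ and $(2\pi)^{-1}$ times the double sum of $\{s(r - s + 1)\}^{-1/2}$; the function names are illustrative.

```python
from math import pi, sqrt

def mu1(n):
    """First moment: (2*pi)**(-1/2) * sum_{r=1}^{n-1} r**(-1/2)."""
    return sum(r ** -0.5 for r in range(1, n)) / sqrt(2 * pi)

def mu2(n):
    """Second moment about zero of the maximum of the n partial sums."""
    total = sum((s * (r - s + 1)) ** -0.5
                for r in range(1, n - 1) for s in range(1, r + 1))
    return 0.5 * (n + 1) + total / (2 * pi)

def sigma(n):
    """Standard deviation of the maximum."""
    return sqrt(mu2(n) - mu1(n) ** 2)

for n in (3, 10, 25):
    print(n, round(mu1(n), 4), round(mu2(n), 4), round(sigma(n), 4))
```

Because the terms of the double sum do not depend on `n`, a production version could accumulate them once and reuse the partial totals for successive values of `n`.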

Values corresponding to very high $n$ may be approximated by a result of Erdős & Kac
(1946), who gave the limiting distribution of $\theta_n = n^{-1/2}U_n$ as

$$\lim_{n \to \infty} \Pr\{\theta_n < \alpha\} = \sqrt{\frac{2}{\pi}} \int_0^{\alpha} \exp(-\tfrac12 x^2)\,dx \quad (\alpha > 0),$$
$$\phantom{\lim_{n \to \infty} \Pr\{\theta_n < \alpha\}} = 0 \quad (\alpha < 0).$$

The limiting second moment of $\theta_n$ is thus unity, and the asymptotic second moment of
$U_n$ is $n$.

Our results are in agreement with this. We may approximate the double sum in (7.1)
by the double integral

$$\int_{y=1/2}^{n-3/2} \int_{x=1/2}^{y} x^{-1/2}(y + 1 - x)^{-1/2}\,dx\,dy.$$

If we reverse the order of integration, this may easily be evaluated explicitly to give an
expression which, to terms of order $n^{1/2}$, reduces to

$$\pi n - 2(2 + \sqrt{2})\,n^{1/2}.$$

Thus
$$\mu_2(n) \sim n - \frac{2 + \sqrt{2}}{\pi}\sqrt{n}.$$

For $n = 25$ this gives $\mu_2 \simeq 19.6$; the correct value is $\mu_2 = 21.001$.
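
The size of the approximation error is easily tabulated. This short sketch evaluates only the asymptotic expression $n - \{(2 + \sqrt{2})/\pi\}\sqrt{n}$; the function name is illustrative.

```python
from math import pi, sqrt

def mu2_asymptotic(n):
    """Leading terms of the second moment: n - ((2 + sqrt(2)) / pi) * sqrt(n)."""
    return n - (2 + sqrt(2)) / pi * sqrt(n)

for n in (25, 100, 400):
    print(n, round(mu2_asymptotic(n), 2))
```

At $n = 25$ the result is about 19.6, which falls short of the exact 21.001 by roughly 7%, so the approximation is best reserved for values of $n$ well beyond the range of the table.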


The author wishes to acknowledge his debt to the referee for the asymptotic results 
given in the last paragraph. 

SPATIAL POINT PROCESSES, WITH APPLICATIONS TO ECOLOGY


By H. R. THOMPSON 
Applied Mathematics Laboratory, D.S.I.R., Wellington, New Zealand 


1. INTRODUCTION 


The term ‘point processes’, referring to stochastic processes in which events occur at more 
or less irregular intervals and which are represented by points on the time-axis, is of com- 
paratively recent origin, although the existence of such processes has in fact been well 
known for a long time. They have been discussed fairly extensively in such diverse applica- 
tions as the counting of radioactive impulses, telephone calls and cases of contagious 
diseases. Wold (1949) developed a statistical theory for treating processes of this type, and 
also mentioned briefly how the events could take place in a two-dimensional or higher field. 
Such a generalization, from events with no time extension to those with no ‘space’ exten- 
sion (i.e. specifically of a point character), has a suitable field of application in the ecological 
study of the distributional pattern of plants. If we can assume to a first approximation that 
the plants have the dimensions of a point, then we shall see that it is possible to discuss 
precisely probability relationships between the numbers of plants in different areas of the 
region under investigation. 

The main aims of quantitative ecology are the precise description of a community of 
plants with interpretations in terms of the biology of the species, and the correlation of 
vegetational and environmental data, and ecologists have used several methods in an 
attempt to achieve these aims. In most of the initial work on field sampling for ecological 
data, the procedure was to take ‘quadrats’ (sample areas small in relation to the total 
area of the region) scattered at random over the area, and study statistics derived from the 
frequency distribution of the numbers of plants per quadrat. While this approach is useful 
to some extent, in that any given type of distribution function may be fitted to the data, 
it does not necessarily furnish the kind of information required by an ecologist. It will 
not give any evidence of trends, or indicate the pattern of the distribution over the area 
or the way in which this pattern may have arisen, all factors of prime importance in the 
study of the structure of a plant community. We only have to cite the negative binomial 
distribution, which is known to arise in at least four different ways, all based on widely 
differing assumptions, to illustrate this point. 

In recent years ecologists have become aware of the need for a more satisfactory approach 
to the problem, and Greig-Smith (1952) provided a potentially great advance on the statis- 
tical side when he recommended the use of a grid of contiguous quadrats over some portion 
or portions of the region. The advantage, of course, in arranging the quadrats in a grid is 
that the analysis of variance technique may be employed, either for the detection of trends, 
or, more importantly, for the detection of a mosaic variation in density (due to ecological 
causes connected with the spread of the plants) by a ‘nested sampling’ type of analysis of 
variance, associating the quadrats into successively larger blocks and comparing the 
component block variances. The details and applications of this method are described at 
length by Greig-Smith, together with the results from sampling experiments on artificial 
plant communities. We are not concerned here with discussing the ecological implications 
of the method (for which the reader is referred to a paper to appear elsewhere (Thompson, 
1955)), but with the application of point process techniques to deriving the probability 
relations between the numbers of plants in the quadrats of a grid, required particularly 
in the analysis of variance and its sampling theory. 

Wold’s treatment of one-dimensional point processes is based on the distribution of the 
time interval between successive events, which could be either independent, or dependent 
on the previous interval, in which case the occurrence of an event would depend on the times 
of occurrence of the two preceding events. Given $A(x, y)\,dt$, equal to the joint probability 
that an event has occurred in $(t, t+dt)$ with the two immediately preceding events at 
instants $t_1$, $t_2$, where $x = t - t_1$, $y = t_1 - t_2$, Wold derives integral equations for the distribution 
functions $F(x, y)$, which is the conditional probability that an event at $t_1$ is followed by an 
event in the interval $(t_1, t_1 + x)$, given that the immediately preceding event occurred at $t_1 - y$, and $G(x)$, 
the absolute probability that an event at $t_1$ is followed by an event in $(t_1, t_1 + x)$. From these 
he obtains the probability $S(v, T)$ of exactly $v$ events in an arbitrary interval of length $T$. 
The definition of A expresses the fact that the process is stationary, for A remains the same 
if the set of events undergoes a translation along the time-axis. 

This type of approach is adequate for one-dimensional fields because, by means of it, a 
process in which the course of events depends on all the prehistory up to a given time can 
be completely specified (apart from random variation); but it does not easily extend to 
more than one dimension. It seems in fact impossible to specify a stochastic process in 
two dimensions by means of a similar dependence of events. The difficulty is now (at least 
in the ecological application) that we are studying a function $F(x, y, T)$ say of the two 
space variables $x$, $y$ and the time variable $t$, for a given value $T$ of $t$, and we are given, 
specified by means of their space co-ordinates, a set of events which may have occurred 
separately at any time. Although the development of the process occurs along a time-axis, 
with events (i.e. new plants) occurring at specified times, it is mainly the consideration of 
the spatial pattern with which we are concerned. 

For the purpose of finding probability relations between the numbers of plants in neigh- 
bouring finite areas, it is convenient to use the method of ‘continuous parameter’ stochastic 
processes, developed in connexion with physical problems such as electron cascades, the 
main contribution being jointly by Bhabha (1950) and Ramakrishnan (1950). They con- 
sidered the case of particles with specific energies in a continuous range, i.e. effectively 
point processes in one dimension. Probability relations between the numbers of particles 
in non-overlapping infinitesimal ranges are defined by density functions of different orders, 
called by Ramakrishnan ‘product densities’, and the integrals of these over finite ranges 
yield linear functions of the product moments of the total numbers of particles in the finite 
ranges. The required mathematical treatment is obtained by considering a function of the 
continuous parameter $E$, $n(E)$ say, which equals 0 everywhere except at a finite number 
of points $E_1, E_2, \ldots, E_k$ say (at which events occur), where $n(E) = 1$. With suitable 
modifications this is the approach adopted in this paper, and in the next section the theory of 
spatial point processes is recapitulated in the most convenient manner. As in Wold, we 
assume that the processes are stationary in the statistical sense, the essential character 
being unaltered by any translations of the axes. This is a quite reasonable assumption for 
plant communities, where trends can often be assumed to be absent. 

2. SPECIFICATION OF POINT STOCHASTIC PROCESSES IN TWO DIMENSIONS 


The distribution of plants over a region is described by a continuous parametric system, 
whose states are labelled by the two continuous variables $x$ and $y$, the co-ordinates of 
position of a plant with respect to a given origin. Thus a given pair of values $(x, y)$, which 
will alternatively be denoted by the parameter $A$, regarded as a vector, is taken to imply 
the occurrence of a plant in the infinitesimal rectangle with lower left-hand corner $(x, y)$ 
and sides of length $dx$, $dy$. The infinitesimal element of area $dx\,dy$ will likewise be denoted 
by the vector $dA$. A realization of such a system will consist of a number of isolated points, 
$k$ say, denoted by the parameter values $A_1, A_2, \ldots, A_k$. All states of the system are covered 
by allowing each $A_i$ to range over the whole domain of $A$, and letting $k = 0, 1, 2, \ldots$, to give 
all possible samples. With each possible sample is associated a probability $\Pi(A_1, A_2, \ldots, A_k)$, 
and a rigorous treatment of continuous parametric systems is obtained (see Bhabha, 1950) 
by a precise mathematical definition of the probability $\Pi$, using set theory. For the purposes 
of this paper a more direct approach, as given by Ramakrishnan (1950), is sufficient. 

Let $N(A)$ denote the number of plants whose positions are below and to the left of the 
value $A = (x, y)$, i.e. whose parametric values are less than $A$. Then we can consistently write 

$$N(A) = \int^{A} dN(A),$$

and regard $dN(A)$ as denoting the number of plants in the infinitesimal rectangle, area $dA$, 
situated at $A$. Assume that the probability of one plant in $dA$ is $m\,dA$, and the probability 
of more than one is $o(dA)$. Hence $dN(A)$ is a variable assumed to take only the values 0 or 1, 
and it follows that all moments of $dN(A)$ are equal to the probability that it takes the value 1. 
Also the consideration of probability relations between different $dN(A)$ is very much 
simplified, for a contribution to the product moment $E\{dN(A_1) \cdots dN(A_k)\}$ will only occur when 
there is a plant in each $dA_i$, and the contribution is then simply unity. If any one of the 
$dA_i$ contains no plant, the contribution is zero. 


The probability relations between the numbers $dN(A)$ in the small areas $dA$ are defined 
by functions of the form 

$$f_k(A_1, A_2, \ldots, A_k)\,dA_1\,dA_2 \cdots dA_k = E\{dN(A_1)\,dN(A_2) \cdots dN(A_k)\}, \tag{2.1}$$

which represents the joint probability that there is one plant in $dA_1$, one in $dA_2$, ..., one in 
$dA_k$, when $dA_1, dA_2, \ldots, dA_k$ are all separate non-overlapping areas. $f_k$ is called a product 
density of degree $k$. $f_1(A)\,dA = E\{dN(A)\} = m\,dA$, and it may be noted that the integral 
of $f_1$ yields only the mean number of plants in the area of integration, as the addition rule 
of probabilities does not apply, the events in general not being mutually exclusive. When 
two of the $dA_i$ are the same, a product density of degree $k$ becomes one of degree $k-1$. 


For if $dA_k = dA_{k-1}$, 

$$E\{dN(A_1)\,dN(A_2) \cdots dN(A_{k-1})^2\} = E\{dN(A_1)\,dN(A_2) \cdots dN(A_{k-1})\} = f_{k-1}(A_1, A_2, \ldots, A_{k-1})\,dA_1\,dA_2 \cdots dA_{k-1}.$$

It is degeneracies of this type that lead to the following theorem, given by Bhabha, for the 
product moment of the numbers $N(A_i') - N(A_i)$, denoted by $N_{[i]}$, in finite areas $[A_i, A_i']$ 
($i = 1, 2, \ldots, k$). $A_i$ is the lower left-hand point, $A_i'$ the upper right-hand point of the area, 
which is assumed to be rectangular. The $k$ areas are denoted by the numbers $[1], [2], \ldots, [k]$, 
and any number of them may be the same area: 


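$$\begin{aligned}
E\{N_{[1]} N_{[2]} \cdots N_{[k]}\} ={}& \int_{[1,\ldots,k]} f_1(A)\,dA + \sum \int_{[1,\ldots,r]}\int_{[r+1,\ldots,k]} f_2(A_1, A_2)\,dA_1\,dA_2 \\
&+ \sum \int_{[1,\ldots,r]}\int_{[r+1,\ldots,s]}\int_{[s+1,\ldots,k]} f_3(A_1, A_2, A_3)\,dA_1\,dA_2\,dA_3 \\
&+ \cdots + \int_{[1]}\int_{[2]} \cdots \int_{[k]} f_k(A_1, \ldots, A_k)\,dA_1 \cdots dA_k,
\end{aligned} \tag{2.2}$$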


where 

(1) by $[1, \ldots, r]$ is meant the equality of the $r$ areas $[1], \ldots, [r]$, if in fact they are equal. 
If they are not, the terms containing $[1, \ldots, r]$ do not appear in the expression; 

(2) the summation sign before each term denotes a summation over like terms in which 
the integers $1, \ldots, k$ are distributed in all possible ways between the brackets affixed to the 
integrals, there being no distinction between the order of the integers in a bracket or between 
the order of the brackets. For example, with $k = 4$, integrals of the type of the second term 
are summed with the groupings of the integers given by 

[1][234], [2][341], [3][412], [4][123], [12][34], [13][24], [14][23], 

of the third term by 

[1][2][34], [1][3][24], [1][4][23], [2][3][14], [2][4][13], [3][4][12]. 


Thus, if the four areas are the same area [1] say, we have for the expectation of the fourth 
power of the number in [1] 

$$E\{N_{[1]}^4\} = \int_{[1]} f_1 + 7\int_{[1]}\int_{[1]} f_2 + 6\int_{[1]}\int_{[1]}\int_{[1]} f_3 + \int_{[1]}\int_{[1]}\int_{[1]}\int_{[1]} f_4,$$

using a condensed notation. If there is more than one area to consider, the expression is 
modified; for example, 

$$E\{N_{[1]}^2 N_{[2]}^2\} = \int_{[1]}\int_{[2]} f_2 + \int_{[1]}\int_{[1]}\int_{[2]} f_3 + \int_{[1]}\int_{[2]}\int_{[2]} f_3 + \int_{[1]}\int_{[1]}\int_{[2]}\int_{[2]} f_4.$$


The product densities may consistently be regarded as factorial-moment densities (cf. 
Bartlett, 1954). For, if we consider the expression for $E\{N_{[1]}^r\}$, the coefficient of the term 
containing the product density of degree $s$ is the sum of the coefficients of all terms 

$$\sum \cdots \sum dN(A_1)^{\alpha_1}\,dN(A_2)^{\alpha_2} \cdots dN(A_s)^{\alpha_s}$$

for a given $s$ and all different combinations of $\alpha_1, \alpha_2, \ldots, \alpha_s$ subject to the condition 
$\sum_{i=1}^{s} \alpha_i = r$. That is, the coefficient of $\int_{[1]} \cdots \int_{[1]} f_s$ in $E\{N_{[1]}^r\}$ is 
$\sum r!/(\alpha_1!\,\alpha_2! \cdots \alpha_s!)$, the summation being 
over all possible different combinations of the $\alpha_i$. Stevens (1937), in a different connexion, 
proves that this coefficient is of the general form $\Delta^s 0^r/s! = C_s^r$ say, where $\Delta^s 0^r$ ($s = 1, 2, \ldots, r$) 
is the $s$th leading difference of the $r$th powers of the natural numbers. Thus we have 

$$E\{N_{[1]}^r\} = C_1^r \int_{[1]} f_1(A)\,dA + C_2^r \int_{[1]}\int_{[1]} f_2(A_1, A_2)\,dA_1\,dA_2 + \cdots + C_r^r \int_{[1]} \cdots \int_{[1]} f_r(A_1, \ldots, A_r)\,dA_1 \cdots dA_r.$$

One of Stevens's results is, in our notation, 

$$N_{[1]}^r = C_1^r N_{[1]} + C_2^r N_{[1]}(N_{[1]} - 1) + \cdots + C_r^r N_{[1]}(N_{[1]} - 1) \cdots (N_{[1]} - r + 1).$$

A simple inductive argument then shows that 

$$\int_{[1]} \cdots \int_{[1]} f_k(A_1, \ldots, A_k)\,dA_1 \cdots dA_k = E\{N_{[1]}(N_{[1]} - 1) \cdots (N_{[1]} - k + 1)\}.$$
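
The coefficients $C_s^r = \Delta^s 0^r/s!$ can be checked numerically. The sketch below is illustrative only (the function names are hypothetical, not from the paper): it forms the leading differences of the $r$th powers directly, and confirms Stevens's identity for small integers. For $r = 4$ it should return the coefficients 1, 7, 6, 1 that appear in the expression for $E\{N_{[1]}^4\}$.

```python
from math import comb, factorial


def differences_of_zero(s, r):
    """s-th leading difference of the r-th powers 0^r, 1^r, 2^r, ... (Delta^s 0^r)."""
    return sum((-1) ** (s - j) * comb(s, j) * j ** r for j in range(s + 1))


def coefficient(s, r):
    """C_s^r = Delta^s 0^r / s! (a Stirling number of the second kind)."""
    return differences_of_zero(s, r) // factorial(s)


def falling_factorial(n, s):
    """n (n - 1) ... (n - s + 1)."""
    out = 1
    for j in range(s):
        out *= n - j
    return out


def power_from_factorials(n, r):
    """Right-hand side of Stevens's identity: sum over s of C_s^r n(n-1)...(n-s+1)."""
    return sum(coefficient(s, r) * falling_factorial(n, s) for s in range(1, r + 1))


coefficients_r4 = [coefficient(s, 4) for s in range(1, 5)]
print(coefficients_r4)
```

For a Poisson field, where $f_s = m^s$, the same coefficients give the ordinary moments of a Poisson variable in terms of powers of its mean.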


For a process to be stationary we require that the covariance of the numbers in two areas 
is independent of their absolute position. Thus, for two areas [1], [2], 

$$\begin{aligned}
\operatorname{cov}(N_{[1]}, N_{[2]}) &= E[\{N_{[1]} - E(N_{[1]})\}\{N_{[2]} - E(N_{[2]})\}] \\
&= E\{N_{[1]} N_{[2]}\} - E\{N_{[1]}\}\,E\{N_{[2]}\} \\
&= \int_{[1]}\int_{[2]} \{f_2(A_1, A_2) - m^2\}\,dA_1\,dA_2, \quad \text{from (2.2).}
\end{aligned}$$

This implies that 

$$f_2(A_1, A_2) - m^2 = w(A_1 - A_2),$$

so that $f_2(A_1, A_2)$ is a function of the differences $x_2 - x_1$, $y_2 - y_1$. 

In practical applications the calculation of the product density of degree k will generally 
involve a tabulation of the different ways of getting a plant in each of the k infinitesimal 
areas, due usually to the presence of different classes of plants (different groups or families 
of plants, different generations, etc.). $f_k$ is then obtained as the sum of the (mutually 
exclusive) probabilities for all these possible alternative cases, according to the addition 
rule of probabilities. 


3. ANALYSIS OF VARIANCE ON A GRID OF QUADRATS 


The individual terms of the analysis of variance being considered in this paper are the sums 
of squares within blocks of a given size and between blocks of the next lowest size. The most 
useful type of grid to employ is one in which the number of quadrats is a power of 2, leading 
to blocks of 2,4,8,... quadrats and consequently more information, relative to the size of 
the grid, than any other arrangement of blocks. The total size of the grid is quite arbitrary; 
for practical and theoretical purposes it has been fixed here at 256 quadrats, arranged in a 
16 x 16 square. With a suitable choice of quadrat size, the important ecological effects 
should be found in the terms for the smaller-sized blocks, which have a fairly large number 
of degrees of freedom and are consequently subject to less fluctuation. 

From the definition above, we see that if $B_{k,i}$ denotes the total number of plants in the 
$i$th block of $k$ quadrats, then $S_k$, the sum of squares within blocks of $k$ quadrats and between 
blocks of $\tfrac12 k$ quadrats, is given by 

$$S_k = \frac{2}{k}\sum_{i=1}^{512/k} B_{k/2,\,i}^2 - \frac{1}{k}\sum_{i=1}^{256/k} B_{k,\,i}^2, \tag{3.1}$$

from which $S_k$ has $n_k = 256/k$ degrees of freedom. This is ordinarily the most convenient 
practical way of calculating individual terms, but for deriving an expected analysis of 
variance and its sampling errors for any given theoretical model we need to work in terms 
of the numbers in a single quadrat. Consider now the moments of $S_k$ for a given $k$ ($k = 2, 4, 8, 
\ldots, 256$). A three-suffix notation for the quadrat numbers is the most convenient for this 
purpose. 

Let $N_{ijl}$ be the observed number of plants in quadrat $(ijl)$ of the grid, where $i$ ($= 1, 2, \ldots, 
256/k$) denotes a primary block of $k$ quadrats, $j$ ($= 1, 2$) a subblock of $\tfrac12 k$ quadrats, and 
$l$ ($= 1, 2, \ldots, \tfrac12 k$) a quadrat of a subblock. We assume that blocks of the same size always 
have identical shapes. For $k = 2^{2p+2}$ ($p = 0, 1, 2, 3$) blocks are taken to be squares of sides 
$2^{p+1}$ quadrats, while for $k = 2^{2p+1}$ they are assumed rectangular, of dimensions $2^p \times 2^{p+1}$ 
quadrats, with the longer length always in the same direction. 
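
The calculation in (3.1) can be sketched in a few lines. The code below is an illustration under stated assumptions, not the author's procedure: the grid is held as a 16 x 16 list of lists, the longer side of every rectangular block is taken to run along the rows of the list (the paper only requires a fixed direction), and the helper names `block_shape`, `block_totals` and `sums_of_squares` are hypothetical.

```python
import random


def block_shape(k):
    """(rows, cols) of a block of k = 2**p quadrats: square for even p, 1:2 rectangle otherwise."""
    p = k.bit_length() - 1
    return 2 ** (p // 2), 2 ** (p - p // 2)


def block_totals(grid, rows, cols):
    """Totals B of the non-overlapping rows x cols blocks that tile the square grid."""
    n = len(grid)
    return [
        sum(grid[r + dr][c + dc] for dr in range(rows) for dc in range(cols))
        for r in range(0, n, rows)
        for c in range(0, n, cols)
    ]


def sums_of_squares(grid):
    """Return {k: S_k} as in (3.1), for k = 2, 4, ..., number of quadrats."""
    n_quadrats = len(grid) ** 2
    out = {}
    k = 2
    while k <= n_quadrats:
        half = block_totals(grid, *block_shape(k // 2))   # blocks of k/2 quadrats
        full = block_totals(grid, *block_shape(k))        # blocks of k quadrats
        out[k] = 2.0 / k * sum(b * b for b in half) - 1.0 / k * sum(b * b for b in full)
        k *= 2
    return out


# Arbitrary made-up counts, used only to exercise the functions.
random.seed(1)
grid = [[random.randint(0, 5) for _ in range(16)] for _ in range(16)]
S = sums_of_squares(grid)
mean_squares = {k: S[k] / (256 // k) for k in S}   # S_k / n_k with n_k = 256/k
```

Because the blocks are nested, the $S_k$ should add up to the total sum of squares of the quadrat counts about their mean, which gives a simple check on any implementation.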

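From (3.1), transforming $B_{k,i}$ to a sum of the $N_{ijl}$, we obtain 

$$S_k = \frac{2}{k}\sum_i \Big\{\Big(\sum_l N_{i1l}\Big)^2 + \Big(\sum_l N_{i2l}\Big)^2\Big\} - \frac{1}{k}\sum_i \Big(\sum_{j,l} N_{ijl}\Big)^2 = \frac{1}{k}\sum_i \Big\{\sum_l (N_{i1l} - N_{i2l})\Big\}^2. \tag{3.2}$$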


From (3.2), by long and tedious but otherwise simple algebra, the higher powers of $S_k$ may 
be obtained. We quote only $S_k^2$ in full: 

$$S_k^2 = \frac{1}{k^2}\Big[\sum_i \Big\{\sum_l (N_{i1l} - N_{i2l})\Big\}^4 + \sum_{i \ne i'} \Big\{\sum_l (N_{i1l} - N_{i2l})\Big\}^2 \Big\{\sum_l (N_{i'1l} - N_{i'2l})\Big\}^2\Big] \tag{3.3}$$

[The term-by-term expansion of (3.3) in products of the $N_{ijl}$ is not recoverable from this copy.] 

where $i, i'$; $j, j'$; $l, l', l'', l'''$ are varied independently over their whole respective ranges unless 
otherwise stated. 

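For the mean and variance of $S_k$, we take expected values of (3.2) and (3.3), and see that 
in the first case, for any one block $i$, 

$$E\{S_k\} = \frac{256}{k^2}\Big(\sum_{j,l} E\{N_{ijl}^2\} + 2\sum_j \sum_{l < l'} E\{N_{ijl} N_{ijl'}\} - 2\sum_{l,l'} E\{N_{i1l} N_{i2l'}\}\Big) \tag{3.4}$$

and is only a linear combination of terms like $E\{N_{[1]}^2\}$, $E\{N_{[1]} N_{[2]}\}$, while $E\{S_k^2\}$ will be a linear 
combination of terms like $E\{N_{[1]}^4\}$, $E\{N_{[1]}^3 N_{[2]}\}$, $E\{N_{[1]}^2 N_{[2]}^2\}$, $E\{N_{[1]}^2 N_{[2]} N_{[3]}\}$, $E\{N_{[1]} N_{[2]} N_{[3]} N_{[4]}\}$. 
The expressions for these expectations, obtained from (2.2), are given below, with 
$f_k(A_1, \ldots, A_k)\,dA_1 \cdots dA_k$ written $f_k$ for short: 

$$\begin{aligned}
E\{N_{[1]}^2\} &= \int_{[1]} f_1 + \int_{[1]}\int_{[1]} f_2, \qquad E\{N_{[1]} N_{[2]}\} = \int_{[1]}\int_{[2]} f_2, \\
E\{N_{[1]}^4\} &= \int_{[1]} f_1 + 7\int_{[1]}\int_{[1]} f_2 + 6\int_{[1]}\int_{[1]}\int_{[1]} f_3 + \int_{[1]}\int_{[1]}\int_{[1]}\int_{[1]} f_4, \\
E\{N_{[1]}^3 N_{[2]}\} &= \int_{[1]}\int_{[2]} f_2 + 3\int_{[1]}\int_{[1]}\int_{[2]} f_3 + \int_{[1]}\int_{[1]}\int_{[1]}\int_{[2]} f_4, \\
E\{N_{[1]}^2 N_{[2]}^2\} &= \int_{[1]}\int_{[2]} f_2 + \int_{[1]}\int_{[1]}\int_{[2]} f_3 + \int_{[1]}\int_{[2]}\int_{[2]} f_3 + \int_{[1]}\int_{[1]}\int_{[2]}\int_{[2]} f_4, \\
E\{N_{[1]}^2 N_{[2]} N_{[3]}\} &= \int_{[1]}\int_{[2]}\int_{[3]} f_3 + \int_{[1]}\int_{[1]}\int_{[2]}\int_{[3]} f_4, \\
E\{N_{[1]} N_{[2]} N_{[3]} N_{[4]}\} &= \int_{[1]}\int_{[2]}\int_{[3]}\int_{[4]} f_4.
\end{aligned} \tag{3.5}$$


4. APPLICATION OF THE THEORY TO AN ECOLOGICAL MODEL 
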
The simplest assumption mathematically that can be made about a community of plants 
is that they are distributed at random in the Poisson distribution. Such a distribution might 
conceivably arise when an area is first invaded by wind-borne seeds, but it is hardly ever 
encountered in practice. A more realistic and more useful model is obtained by allowing 
plants distributed in this manner, i.e. at random, to become the centres of distribution 
(‘parents’) of a generation of ‘offspring’ plants, whose positions depend on those of their 
respective parents. The offspring of a given parent are assumed to be distributed 
independently of each other, the distances $r$ from parent to offspring following the isotropic normal 
distribution 

$$f(r)\,dr = e^{-r^2/2\sigma^2}\,dr/(2\pi\sigma^2), \tag{4.1}$$

and the numbers $n$ of offspring per parent following an arbitrary probability law $p(n)$. 
We consider for simplicity the distribution of the offspring only, for then the only distances 
which have to be taken into account are those between offspring of the same group, and 
these are also isotropic normal, but with parameter $\sqrt{2}\,\sigma$. For, $f(r)\,dr$ is the product of two 
independent components $e^{-x^2/2\sigma^2}\,dx/\sqrt{(2\pi\sigma^2)}$, $e^{-y^2/2\sigma^2}\,dy/\sqrt{(2\pi\sigma^2)}$ along rectangular axes, and 
the distance between two offspring projected on either of these axes is merely the sum of 
two independent distances (from the parent) following a one-dimensional normal 
distribution. 
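
A realization of this model is easy to simulate, which gives a direct way of checking the expectations derived later. The sketch below is illustrative only: the function names, the default binomial law (six trials with probability one half, as in the numerical example given later), the 16 x 16 grid and the margin of five standard deviations around the grid are all assumptions made for the example. Parents are scattered over the enlarged region so that clumps centred just outside the grid can still contribute offspring.

```python
import math
import random


def poisson(mean, rng):
    """Poisson variate from unit-rate exponential gaps (adequate for moderate means)."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_counts(m0, sigma, h=1.0, side=16, big_n=6, p=0.5, seed=0):
    """Quadrat counts of offspring for Poisson parents of density m0.

    Each parent has a binomial(big_n, p) number of offspring, displaced from the
    parent by independent N(0, sigma^2) amounts in x and y, as in (4.1).
    """
    rng = random.Random(seed)
    width = side * h
    margin = 5.0 * sigma                       # assumed buffer against edge effects
    lo, hi = -margin, width + margin
    n_parents = poisson(m0 * (hi - lo) ** 2, rng)
    counts = [[0] * side for _ in range(side)]
    for _ in range(n_parents):
        px, py = rng.uniform(lo, hi), rng.uniform(lo, hi)
        n_offspring = sum(rng.random() < p for _ in range(big_n))
        for _ in range(n_offspring):
            x = px + rng.gauss(0.0, sigma)
            y = py + rng.gauss(0.0, sigma)
            if 0.0 <= x < width and 0.0 <= y < width:
                row = min(int(y / h), side - 1)
                col = min(int(x / h), side - 1)
                counts[row][col] += 1
    return counts


# Offspring density m = m0 * E{n} = 1 per unit quadrat when m0 = 1/3 and E{n} = 3.
counts = simulate_counts(m0=1.0 / 3.0, sigma=1.0 / math.sqrt(2.0))
total_plants = sum(map(sum, counts))
```

Repeating the simulation with different seeds and averaging the resulting mean squares gives an empirical counterpart to the expected analysis of variance.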


The product density $f_k$ of degree $k$ is found (§ 2) as the sum of mutually exclusive 
probabilities, denoted by $\Pr(dA_1, \ldots, dA_k)$, for all the possible alternatives with $k$ different 
areas $dA_1, \ldots, dA_k$. Here the contributions to $f_k$ will arise from all different combinations 
of $k$ plants from $s$ groups ($s \le k$) such that there are $\alpha_1$ plants from the first group, ..., $\alpha_s$ 
from the $s$th, with $\sum_{i=1}^{s} \alpha_i = k$. Of these we need only consider the case $s = 1$, i.e. all $k$ plants 
from the same group; the other cases will merely result in products of lower order product 
densities since the contributions from different groups are entirely independent. We 
prove now that the joint probability of $k$ offspring of the same group occurring in 
$dA_1, \ldots, dA_k$ is 

$$\Pr(dA_1, \ldots, dA_k) = m_0\,E\{n(n-1) \cdots (n-k+1)\}\,\phi(A_1, \ldots, A_k)\,dA_1 \cdots dA_k, \tag{4.2}$$

where 

$$\phi(A_1, \ldots, A_k) = \frac{1}{k}\left(\frac{1}{2\pi\sigma^2}\right)^{k-1} \exp\left[-\sum_{i=1}^{k-1}\sum_{j>i} \frac{R_{ij}^2}{2k\sigma^2}\right], \qquad R_{ij}^2 = (x_i - x_j)^2 + (y_i - y_j)^2, \tag{4.3}$$

and $m_0$ is the mean density of the parent plants. The mean density of offspring is 
$m = m_0 E\{n\}$. 
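Assume that the position of the parent plant of a group with a random number $n$ of 
offspring is at $dA_0$. The probability of this is 

$$m_0\,dA_0.$$

The probability of $k$ ($\le n$) offspring of this parent in $dA_1, \ldots, dA_k$, distant $R_{01}, \ldots, R_{0k}$ from 
$dA_0$, given the parent's position is at $dA_0$, is 

$$n(n-1) \cdots (n-k+1)\left(\frac{1}{2\pi\sigma^2}\right)^{k} \exp\left[-\sum_{i=1}^{k} \frac{R_{0i}^2}{2\sigma^2}\right] dA_1 \cdots dA_k,$$

since there are $n$ possibilities for $dA_1$, $n-1$ for $dA_2$, ..., $n-k+1$ for $dA_k$, and each distance 
is distributed independently. Therefore, averaging over $n$, 

$$\begin{aligned}
\Pr(dA_0, dA_1, \ldots, dA_k) &= \Pr(dA_0)\,\Pr(dA_1, \ldots, dA_k \mid dA_0) \\
&= m_0 E\{n(n-1) \cdots (n-k+1)\}\left(\frac{1}{2\pi\sigma^2}\right)^{k} \exp\left[-\sum_{i=1}^{k} \frac{(x_0 - x_i)^2 + (y_0 - y_i)^2}{2\sigma^2}\right] dA_0\,dA_1 \cdots dA_k \\
&= m_0 E\{n(n-1) \cdots (n-k+1)\}\,\frac{k}{2\pi\sigma^2} \exp\left[-\frac{k\{(x_0 - \bar{x})^2 + (y_0 - \bar{y})^2\}}{2\sigma^2}\right] dA_0 \\
&\quad \times \frac{1}{k}\left(\frac{1}{2\pi\sigma^2}\right)^{k-1} \exp\left[-\sum_{i=1}^{k} \frac{(x_i - \bar{x})^2 + (y_i - \bar{y})^2}{2\sigma^2}\right] dA_1 \cdots dA_k,
\end{aligned}$$

where $\bar{x} = \sum_{i=1}^{k} x_i/k$, $\bar{y} = \sum_{i=1}^{k} y_i/k$. 

Integrating out with respect to $dA_0$ ($= dx_0\,dy_0$) over the whole region (assumed infinite), and 
noting that $\sum_{i=1}^{k} \{(x_i - \bar{x})^2 + (y_i - \bar{y})^2\} = \sum_{i<j} R_{ij}^2/k$, we 
obtain (4.2). $\phi(A_1, \ldots, A_k)$ may be shown to be a product of two $(k-1)$-variate normal 
distributions (in $x$ and $y$), the $k-1$ variables being, for example, the differences of the first 
$k-1$ values of $x$ (or $y$) from the $k$th, so that each variable is distributed normally, mean 
zero, variance $2\sigma^2$, and the correlation between any two variables is $\tfrac12$. 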
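
The integration over the parent's position can be confirmed numerically. The following sketch (hypothetical function names; the test points and $\sigma$ are arbitrary) evaluates the closed form (4.3) and compares it with a direct midpoint-rule integration of the product of $k$ isotropic normal densities over the parent position. The two-dimensional integral is computed as a product of two one-dimensional sums, since the integrand factorizes in $x$ and $y$.

```python
import math


def phi_closed(points, sigma):
    """phi(A_1, ..., A_k) of (4.3) for a list of (x, y) positions."""
    k = len(points)
    sum_sq = sum(
        (xi - xj) ** 2 + (yi - yj) ** 2
        for a, (xi, yi) in enumerate(points)
        for (xj, yj) in points[a + 1:]
    )
    norm = (1.0 / k) * (1.0 / (2.0 * math.pi * sigma ** 2)) ** (k - 1)
    return norm * math.exp(-sum_sq / (2.0 * k * sigma ** 2))


def phi_quadrature(points, sigma, half_width=8.0, step=0.05):
    """Integrate the product of k normal densities over the parent position (midpoint rule)."""
    k = len(points)
    d = step * sigma
    n = int(round(2.0 * half_width / step))

    def line_integral(coords):
        centre = sum(coords) / k
        total = 0.0
        for i in range(n):
            u = centre + (i + 0.5 - n / 2.0) * d
            total += math.exp(-sum((u - c) ** 2 for c in coords) / (2.0 * sigma ** 2))
        return total * d

    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (1.0 / (2.0 * math.pi * sigma ** 2)) ** k * line_integral(xs) * line_integral(ys)


points = [(0.0, 0.0), (1.0, 0.5), (-0.3, 0.8)]
closed = phi_closed(points, sigma=0.7)
numeric = phi_quadrature(points, sigma=0.7)
```

For $k = 2$ the closed form reduces to $e^{-R_{12}^2/4\sigma^2}/(4\pi\sigma^2)$, the isotropic normal density with parameter $\sqrt{2}\,\sigma$ noted earlier.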

For the mean values and variances of the set of sums of squares we require the product 
densities of second, third and fourth degree. With two areas $dA_1$ and $dA_2$, there are only two 
possibilities to consider; both areas might contain plants from different groups, in which 
case the joint probability $\Pr(dA_1, dA_2)$ is $m^2\,dA_1\,dA_2$, or else the plants may be offspring 
from the same group. We note that the case where one of the $dA$ contains no plant is 
automatically excluded, since it makes no contribution to the second degree product 
density. For the second case, we have from (4.2) 

$$\Pr(dA_1, dA_2) = m_0 E\{n(n-1)\}\,\phi(A_1, A_2)\,dA_1\,dA_2,$$

so that, adding the two mutually exclusive probabilities, we obtain 

$$f_2(A_1, A_2)\,dA_1\,dA_2 = [m_0 E\{n(n-1)\}\,\phi(A_1, A_2) + m^2]\,dA_1\,dA_2 = \left(\frac{m g_2}{4\pi\sigma^2}\,e^{-R_{12}^2/4\sigma^2} + m^2\right) dA_1\,dA_2, \tag{4.4}$$

from (4.3), where 

$$g_r = E\{n(n-1) \cdots (n-r+1)\}/E\{n\}. \tag{4.5}$$


Now tabulate the possibilities for three plants in three different areas $dA_1$, $dA_2$, $dA_3$. 
If the plants are all in different groups, 

$$\Pr(dA_1, dA_2, dA_3) = m^3\,dA_1\,dA_2\,dA_3.$$

If $dA_1$, $dA_2$ are in a group $G$, $dA_3$ not in $G$, then since the contributions from different groups 
are independent, we have 

$$\Pr(dA_1, dA_2, dA_3) = m_0 E\{n(n-1)\}\,\phi(A_1, A_2)\,dA_1\,dA_2 \cdot m\,dA_3.$$

If $dA_1$, $dA_2$, $dA_3$ are all in the same group, 

$$\Pr(dA_1, dA_2, dA_3) = m_0 E\{n(n-1)(n-2)\}\,\phi(A_1, A_2, A_3)\,dA_1\,dA_2\,dA_3.$$

This exhausts all the possibilities; therefore, 

$$\begin{aligned}
f_3(A_1, A_2, A_3) &= m^3 + m^2 g_2\{\phi(A_1, A_2) + \phi(A_1, A_3) + \phi(A_2, A_3)\} + m g_3\,\phi(A_1, A_2, A_3) \\
&= m^3 + m^2 g_2\,(e^{-R_{12}^2/4\sigma^2} + e^{-R_{13}^2/4\sigma^2} + e^{-R_{23}^2/4\sigma^2})/(4\pi\sigma^2) \\
&\quad + m g_3\,e^{-(R_{12}^2 + R_{13}^2 + R_{23}^2)/6\sigma^2}/(12\pi^2\sigma^4).
\end{aligned} \tag{4.6}$$

We omit the tabulation for the fourth degree case, as it is obvious from the expression 

$$\begin{aligned}
f_4(A_1, A_2, A_3, A_4) ={}& m^4 + m^3 g_2\{\phi(A_1, A_2) + \phi(A_1, A_3) + \phi(A_1, A_4) + \phi(A_2, A_3) + \phi(A_2, A_4) + \phi(A_3, A_4)\} \\
&+ m^2 g_2^2\{\phi(A_1, A_2)\,\phi(A_3, A_4) + \phi(A_1, A_3)\,\phi(A_2, A_4) + \phi(A_1, A_4)\,\phi(A_2, A_3)\} \\
&+ m^2 g_3\{\phi(A_1, A_2, A_3) + \phi(A_1, A_2, A_4) + \phi(A_1, A_3, A_4) + \phi(A_2, A_3, A_4)\} \\
&+ m g_4\,\phi(A_1, A_2, A_3, A_4).
\end{aligned} \tag{4.7}$$
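
The tabulations behind (4.4), (4.6) and (4.7) follow one pattern: each way of dividing the $k$ areas among groups contributes a product of factors $m g_r \phi$, one per group, with $g_1 = 1$ and $\phi$ of a single point equal to 1. The sketch below enumerates these divisions mechanically. It is an illustration of that pattern rather than a formula stated in this form in the text, and the names `set_partitions` and `product_density` are hypothetical.

```python
import math


def set_partitions(items):
    """Yield every division of a list into non-empty, unordered blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def phi(points, sigma):
    """phi of (4.3); equals 1 for a single point."""
    k = len(points)
    sum_sq = sum(
        (xi - xj) ** 2 + (yi - yj) ** 2
        for a, (xi, yi) in enumerate(points)
        for (xj, yj) in points[a + 1:]
    )
    norm = (1.0 / k) * (1.0 / (2.0 * math.pi * sigma ** 2)) ** (k - 1)
    return norm * math.exp(-sum_sq / (2.0 * k * sigma ** 2))


def product_density(points, m, g, sigma):
    """f_k at the given points; g maps r to g_r of (4.5), with g[1] = 1."""
    total = 0.0
    for partition in set_partitions(list(range(len(points)))):
        term = 1.0
        for block in partition:
            term *= m * g[len(block)] * phi([points[i] for i in block], sigma)
        total += term
    return total


# g_r for a binomial law with six trials and probability one half.
g = {1: 1.0, 2: 2.5, 3: 5.0, 4: 7.5}
f4_example = product_density([(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)], 1.0, g, 0.7)
```

For four points there are 15 divisions, matching the 1 + 6 + 3 + 4 + 1 terms of (4.7).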


The product densities are integrated and substituted in (3.5) to give the expectations 
needed in the calculation of $E\{S_k\}$, $E\{S_k^2\}$. An explicit solution has been found for the 
integral of $f_2$ only, but product densities of all degrees may be integrated numerically. Let 
the quadrats of the grid be squares of side $h$, and write 

$$\int_{[i]}\int_{[j]} \cdots \phi(A_1, A_2, \ldots)\,dA_1\,dA_2 \cdots = h^2 I_{ij\ldots}, \tag{4.8}$$

where square brackets round each suffix of $I$ may be assumed if desired. Then equations 
(3.5) become 

$$\begin{aligned}
E\{N_{[1]}^2\} ={}& mh^2 + m^2h^4 + mh^2 g_2 I_{11}, \qquad E\{N_{[1]} N_{[2]}\} = m^2h^4 + mh^2 g_2 I_{12}, \\
E\{N_{[1]}^4\} ={}& mh^2 + 7m^2h^4 + 6m^3h^6 + m^4h^8 + mh^2(7 + 18mh^2 + 6m^2h^4)\,g_2 I_{11} \\
&+ 3m^2h^4 g_2^2 I_{11}^2 + mh^2(6 + 4mh^2)\,g_3 I_{111} + mh^2 g_4 I_{1111}, \\
E\{N_{[1]}^3 N_{[2]}\} ={}& m^2h^4 + 3m^3h^6 + m^4h^8 + 3m^2h^4(1 + mh^2)\,g_2 I_{11} + m^2h^4 g_3 I_{111} \\
&+ mh^2(1 + 6mh^2 + 3m^2h^4)\,g_2 I_{12} + 3m^2h^4 g_2^2 I_{11} I_{12} \\
&+ 3mh^2(1 + mh^2)\,g_3 I_{112} + mh^2 g_4 I_{1112}, \\
E\{N_{[1]}^2 N_{[2]}^2\} ={}& m^2h^4 + 2m^3h^6 + m^4h^8 + 2m^2h^4(1 + mh^2)\,g_2 I_{11} \\
&+ mh^2(1 + 4mh^2 + 4m^2h^4)\,g_2 I_{12} + m^2h^4 g_2^2 (I_{11}^2 + 2I_{12}^2) \\
&+ mh^2(1 + 2mh^2)\,g_3 (I_{112} + I_{122}) + mh^2 g_4 I_{1122}, \\
E\{N_{[1]}^2 N_{[2]} N_{[3]}\} ={}& m^3h^6 + m^4h^8 + m^3h^6 g_2 I_{11} + (m^2h^4 + 2m^3h^6)\,g_2 (I_{12} + I_{13}) \\
&+ (m^2h^4 + m^3h^6)\,g_2 I_{23} + m^2h^4 g_2^2 (I_{11} I_{23} + 2 I_{12} I_{13}) \\
&+ (mh^2 + 2m^2h^4)\,g_3 I_{123} + m^2h^4 g_3 (I_{112} + I_{113}) + mh^2 g_4 I_{1123}, \\
E\{N_{[1]} N_{[2]} N_{[3]} N_{[4]}\} ={}& m^4h^8 + m^3h^6 g_2 (I_{12} + I_{13} + I_{14} + I_{23} + I_{24} + I_{34}) \\
&+ m^2h^4 g_2^2 (I_{12} I_{34} + I_{13} I_{24} + I_{14} I_{23}) \\
&+ m^2h^4 g_3 (I_{123} + I_{124} + I_{134} + I_{234}) + mh^2 g_4 I_{1234}.
\end{aligned} \tag{4.9}$$


The details of the calculations are omitted, as much tedious manipulation is involved. 


Table 1. Expected mean squares for the non-random model 

$k$                  2      4      8      16     32     64     128    256 
$n_k$                128    64     32     16     8      4      2      1 
$E\{S'_k/mh^2\}$     1.12   1.19   1.64   1.84   2.40   2.58   2.91   3.02 

There are many different numerical examples possible, since there are two sets of 
parameters to be varied, those of $p(n)$ and those of $f(r)$. A study of the effects of different 
functions $p(n)$ and $f(r)$ on the form of the distribution of $N_{[1]}$, the number of plants in a single 
quadrat, has already been made (Thompson, 1954). Table 1 gives the expected mean 
squares $E\{S'_k/mh^2\}$ ($S'_k = S_k/n_k$), for the particular case $\sigma = h/\sqrt{2}$ (which gives a 
simplification of the numerical integrations) and a binomial distribution for $p(n)$ with a mean of three 
offspring per group. The parameters in 

$$p(n) = \binom{N}{n} p^n (1-p)^{N-n} \qquad (n = 0, 1, \ldots, N)$$

are $N = 6$, $p = \tfrac12$, and $g_r = (N-1)(N-2) \cdots (N-r+1)\,p^{r-1}$. The set of expected mean squares 
$E\{S'_k\}$ has upper and lower limits, reached as $\sigma$ tends to 0 and $\infty$ respectively; they are 
$mh^2(1+g_2)$ and $mh^2$, independent of $k$. These are both mean squares for a Poisson 
distribution, and are to be expected, since in the first case each offspring's position coincides 
with that of its parent, and in the second case there is effectively no clumping. 
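
Since only $f_2$ is needed for the expected mean squares, and its integral over rectangles is explicit, the entries of Table 1 can be recomputed directly. The sketch below rests on a rearrangement that is not written out in the text and should be read as an assumption: combining (3.4) with (4.4) gives $E\{S'_k\}/(mh^2) = 1 + (2g_2/kh^2)\{\int_{B_1}\int_{B_1}\phi - \int_{B_1}\int_{B_2}\phi\}$, where $B_1$, $B_2$ are the two halves of a block of $k$ quadrats. The function names are hypothetical; `K` and `cross` are the one-dimensional integrals of a normal density in the difference of two co-ordinates.

```python
import math


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def K(length, s):
    """Integral over 0 <= u1, u2 <= length of the N(0, s^2) density of (u1 - u2)."""
    g0 = 1.0 / (s * math.sqrt(2.0 * math.pi))
    gl = g0 * math.exp(-length ** 2 / (2.0 * s ** 2))
    return 2.0 * (length * (normal_cdf(length / s) - 0.5) - s ** 2 * (g0 - gl))


def cross(length, s):
    """Same integrand with u1 in (0, length) and u2 in (length, 2 * length)."""
    return 0.5 * (K(2.0 * length, s) - 2.0 * K(length, s))


def g_binomial(r, big_n, p):
    """g_r = (N-1)(N-2)...(N-r+1) p^(r-1) for a binomial offspring law."""
    out = 1.0
    for j in range(1, r):
        out *= (big_n - j) * p
    return out


def expected_mean_square(k, sigma, h=1.0, g2=2.5):
    """E{S'_k} / (m h^2) under the assumed rearrangement described above."""
    q = (k // 2).bit_length() - 1          # each half-block holds 2**q quadrats
    short = 2 ** (q // 2) * h              # side along which the two halves adjoin
    long_ = 2 ** (q - q // 2) * h          # the other side of a half-block
    s = math.sqrt(2.0) * sigma             # s.d. of a co-ordinate difference between siblings
    same = K(short, s) * K(long_, s)       # integral of phi over B1 x B1
    other = cross(short, s) * K(long_, s)  # integral of phi over B1 x B2
    return 1.0 + 2.0 * g2 * (same - other) / (k * h * h)


g2 = g_binomial(2, 6, 0.5)
table = {k: expected_mean_square(k, sigma=1.0 / math.sqrt(2.0), g2=g2)
         for k in (2, 4, 8, 16, 32, 64, 128, 256)}
for k, value in table.items():
    print(k, round(value, 3))
```

Under these assumptions the computed values should agree with Table 1 to about two decimal places, and they stay between the limits 1 and $1 + g_2$ quoted above.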

The standard error of a sum of squares is not in itself a very useful statistic, because of 
the skew distribution of $S_k$, at least for a small number of degrees of freedom. For the 
first three mean squares of Table 1, with fairly large $n_k$ (and for which normal approximations 
might be expected to hold), the standard errors are, respectively, 0.20, 0.27, 0.46 for $mh^2 = 1$, 
and 0.13, 0.16, 0.26 for $mh^2 = \tfrac12$. A more useful practical method of studying the sampling 
variation is to determine the limits of error of individual mean squares, so that the set of 
expected mean squares for a given model has appropriate to it 'significance bands' outside 
which terms observed from samples of the model would only be expected to fall with a given 
(small) probability. In this approach we ignore the fact that a real correlation exists between 
any pair of mean squares, except for the case when the $N_{ijl}$ are independent normal variables, 
and calculate limits of error on the assumption of approximate $\chi^2$ distributions for individual 
mean squares, based on their first two moments, since we know that for $N_{ijl}$ normal (variance 
$\sigma^2$ say) $S_k/\sigma^2$ is distributed exactly as $\chi^2$ with $n_k$ degrees of freedom. Table 2 gives these 
limits of error for the case $mh^2 = 1$ only, but to a first approximation they may be taken to 
apply to mean squares $S'_k/(mh^2)$ so long as $mh^2$ is not too different from unity. The bands are 
symmetrical, in the sense that the probability of an observed mean square being above the 
upper limit equals the probability of its falling below the lower limit. 


Table 2. Approximate $p$ % significance bands for the mean 
squares of Table 1 ($mh^2 = 1$) 

$k$                   2      4      8      16     32     64     128     256 

95 % band (lower)     0.76   0.72   0.86   0.72   0.62   0.28   0.06    0.00 
          (upper)     1.55   1.78   2.65   3.50   5.38   7.28   10.83   15.22 

80 % band (lower)     0.86   0.86   1.07   0.99   0.96   0.62   0.26    0.03 
          (upper)     1.39   1.55   2.25   2.78   4.08   5.08   6.72    8.18 

It is instructive to compare these figures (applying to a model with not a very high degree 
of contagion) with the results for a purely random distribution (parents alone, say), which 
may be shown to approximate very closely to the normal case. The correlations between 
individual mean squares are in this random case almost negligible, and $S_k/mh^2$ follows almost 
exactly a $\chi^2$ distribution with $n_k\{2kmh^2/(1 + 2kmh^2)\}$ degrees of freedom. For a Poisson 
distribution, we have explicitly 

$$E\{S'_k\} = mh^2, \qquad \operatorname{var}\{S'_k\} = mh^2(1 + 2kmh^2)/256.$$

Table 3 gives the limits of error for the case $mh^2 = 1$, but as in Table 2 they may equally 
well apply for other values of $mh^2$. In fact, for $mh^2 = \tfrac12, \infty$, which are symmetrically placed 
with respect to $mh^2 = 1$ here, the differences from the values given in the tables are never 
more than $\pm 0.02$. 
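
The Poisson bands can be approximated without tables. The sketch below makes two assumptions that are not spelled out in the text: the limits are taken as the equal-tail quantiles of $\chi^2_f/f$ with $f = n_k\{2kmh^2/(1 + 2kmh^2)\}$, and the $\chi^2$ quantiles are obtained from the Wilson-Hilferty cube-root approximation, a technique swapped in here because it needs only the normal quantile function. That approximation is good for the larger values of $f$ (small $k$) and poor when $f$ is near 1, so the last columns of Table 3 would need exact $\chi^2$ quantiles.

```python
from statistics import NormalDist


def chi2_quantile_wh(prob, f):
    """Wilson-Hilferty approximation to the chi-square quantile on f degrees of freedom."""
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * f)
    return f * max(0.0, 1.0 - c + z * c ** 0.5) ** 3


def poisson_band(k, mh2=1.0, level=0.95, n_quadrats=256):
    """Approximate equal-tail limits for S'_k / (m h^2) under a Poisson distribution."""
    n_k = n_quadrats / k
    f = n_k * 2.0 * k * mh2 / (1.0 + 2.0 * k * mh2)   # moment-matched degrees of freedom
    tail = (1.0 - level) / 2.0
    return chi2_quantile_wh(tail, f) / f, chi2_quantile_wh(1.0 - tail, f) / f


for k in (2, 4, 8):
    lower, upper = poisson_band(k)
    print(k, round(lower, 2), round(upper, 2))
```

The degrees of freedom follow from matching the mean $mh^2$ and variance $mh^2(1 + 2kmh^2)/256$ of $S'_k$ to those of a scaled $\chi^2$ variable.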

Table 3. Approximate $p$ % significance bands for the mean squares 
from a Poisson distribution ($mh^2 = 1$) 

$k$                   2      4      8      16     32     64     128    256 

95 % band (lower)     0.75   0.67   0.56   0.42   0.27   0.12   0.03   0.00 
          (upper)     1.29   1.40   1.56   1.82   2.20   2.79   3.70   5.03 

80 % band (lower)     0.83   0.77   0.69   0.58   0.43   0.26   0.10   0.02 
          (upper)     1.18   1.25   1.34   1.48   1.68   1.95   2.30   2.71 

5. DISCUSSION 


When an analysis of variance is carried out on a set of observational data, the usual aim is 
to test the homogeneity of the set of variances obtained, employing the $F$ test. With a 
grid of quadrats, the appropriate method would be to test the ratio $(S_k/n_k)/(S_2/n_2)$, which, 
on the null hypothesis that the numbers $N_{ijl}$ in the quadrats are independently and normally 
distributed with the same variance, follows the $F$ distribution with $n_k$ and $n_2$ degrees of 
freedom. The conditions for the applicability of this test will not, however, generally hold 
in the present ecological application, for the associations of plants in clumps result in the 
non-independence of the numbers in neighbouring quadrats, and even more distant 
quadrats if the clumps themselves are related, and the effect of clumping is usually to produce 
a contagious distribution of the quadrat numbers. Under the usual null hypothesis, $H_0$ say, 
we have 

$$\Pr\{F > F_\alpha \mid H_0\} = \alpha, \tag{5.1}$$

where $F = (S_k/n_k)/(S_2/n_2)$, and $F_\alpha$ is the value of $F$ on $n_k$ and $n_2$ degrees of freedom at the 
$100\alpha$ % level of significance. The expected value of $F$ on this hypothesis is unity; however, 
for most non-random models, including the example above, it is greater. (It may 
occasionally be less than unity.) Therefore, to study the power function of the $F$ test we should 
ideally calculate the probability 

$$\Pr\{F/E(F) > F_\alpha \mid H\} = \beta, \tag{5.2}$$

where $H$ is the non-normal hypothesis relating to the particular model, and compare 
$\beta$ with $\alpha$. 

This has been done approximately for the example of § 4, for the two cases $k = 4, 8$ and 
with $\alpha = 0.05, 0.01$, the results being given in Table 4. The method used is one recommended 
by David & Johnson (1951). Rewrite (5.2) in the form 

$$\Pr\{S_k - aS_2 > 0 \mid H\} = \beta, \tag{5.3}$$

where $a = n_k F_\alpha E(F)/n_2 = 2F_\alpha E(F)/k$. The first two moments of $S = S_k - aS_2$ are sufficient 
to find this probability approximately, transforming $S$ so that probability levels for the 
$\chi^2$ distribution can be used. A new statistic $S/(mh^2) + C$ is formed whose first two moments 
are exactly those of a $\chi^2$ distribution with $f$ degrees of freedom by taking 

$$C = \tfrac12 \operatorname{var}\{S/(mh^2)\} - E\{S/(mh^2)\}, \qquad f = \tfrac12 \operatorname{var}\{S/(mh^2)\}.$$

Then $\Pr\{\chi_f^2 > C\}$ furnishes $\beta$. The method is highly satisfactory when the $N_{ijl}$ are Poisson 
or normal variables. 

Table 4. Values of $\beta$ (from (5.3)) for the model of § 4 

                  $\alpha = 0.05$                        $\alpha = 0.01$ 
$k$    $mh^2$ =   1/2           1             ∞          1/2     1       ∞ 
4                 0.090         0.073         0.055      0.033   0.023   0.013 
8                 [illegible]   [illegible]   0.089      0.048   0.038   0.028 

These results show that the actual level of significance being used is 
quite close to its assumed true value. However, we have assumed an a priori knowledge of 
$E\{F\}$, which would not normally be so in practice. A truer picture of what would happen if 
the $F$ test were applied indiscriminately is given in Table 5, where we have calculated 

$$\Pr\{F > F_\alpha \mid H\} = \beta'. \tag{5.4}$$

In this particular example, we see that the use of the $F$ test on unadjusted variances is of 
no value at all for $k$ greater than 4, and even for $k = 4$ the differences between $\beta'$ and $\alpha$ 
are becoming appreciable. 
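
The moment-matching step behind (5.3) can be sketched as follows. The function names are hypothetical, and the $\chi^2$ tail probability is again obtained from the Wilson-Hilferty approximation rather than from tables, which is an assumption of this sketch and not part of the method quoted from David & Johnson. The inputs are the mean and variance of the standardized statistic $S/(mh^2)$; how those moments are obtained from (3.3) and (4.9) is not shown.

```python
from statistics import NormalDist


def match_chi2(mean, variance):
    """Return (C, f) such that X + C has mean f and variance 2f, as for chi-square on f d.f."""
    f = 0.5 * variance
    return f - mean, f


def chi2_upper_tail_wh(x, f):
    """Approximate Pr{chi-square_f > x} by the Wilson-Hilferty cube-root transformation."""
    if x <= 0.0:
        return 1.0
    c = 2.0 / (9.0 * f)
    z = ((x / f) ** (1.0 / 3.0) - (1.0 - c)) / c ** 0.5
    return 1.0 - NormalDist().cdf(z)


def beta_approx(mean, variance):
    """Approximate Pr{S > 0} for a statistic S with the given first two moments."""
    shift, f = match_chi2(mean, variance)
    return chi2_upper_tail_wh(shift, f)


# Made-up moments: a chi-square on 10 d.f. shifted down by its upper 5 % point (18.307).
example = beta_approx(mean=10.0 - 18.307, variance=20.0)
print(round(example, 3))
```

In this artificial case the statistic exceeds zero exactly when the $\chi^2$ variable exceeds its 5 % point, so the result should be close to 0.05.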


Table 5. Values of $\beta'$ (from (5.4)) for the model of § 4 

                  $\alpha = 0.05$              $\alpha = 0.01$ 
$k$    $mh^2$ =   1/2     1       ∞          1/2     1       ∞ 
4                 0.139   0.119   0.098      0.054   0.040   0.027 
8                 0.535   0.537   0.539      0.350   0.342   0.334 

We have only discussed one particular example in this paper. Several others giving rise 
to non-random distributions of plants have been derived, consistent with the mathematical 
and computational difficulty involved. They are described at length in Thompson (1955), 
with more emphasis on the ecological application, however, so that it may be of interest to 
note some of them briefly here. We can extend the model of § 4 to include several generations 
of plants, the distances from each offspring to its parent still following the isotropic normal 
distribution (4-1). The offspring of the original parent become the parents of another genera- 
tion of offspring, which in their turn become parent plants, and so on. The distance between 
any two plants of the same group also follows (4-1), but with parameter depending on the 
number of direct steps between the two plants. If the original parents are randomly dis- 
tributed, we find similarly to (4-4), 


fo(Ay,A_)—m? = mo D Ef{2n,} e~ Pte" /27000°, (5°5) 


where n, is the number of pairs of plants in the group which have the distances between 
them distributed with parameter ac. 





Models using the isotropic normal distribution for f(r) produce over-dispersion of
individuals (ratio of variance to mean in a single quadrat greater than unity). If the x and y
components of the distance between successive plants in a chain follow independent $\chi^2$
distributions with 2f degrees of freedom (f > 2), then under-dispersion of individuals results
because of the ‘negative correlation’ effect introduced (cf. Bartlett, 1954). The frequency
function of the distance between plants g generations apart is easily obtained, being simply
the product of two independent $\chi^2$ distributions with 2fg degrees of freedom. For a group
with n plants (the nth being the offspring of the (n— 1)th, and so on), the original parent 
being randomly distributed and with f = 2, we have 


fo(Ay, Ag) — mm? = me 2(n — g) (Ay Ag)? [(%2— ©) (Yo = Yy) PA-} Asta 49) “Aake—w)/[ (2g — 11], 
g=1 (5-6) 
where A,, A, are arbitrary and are considered in relation to the quadrat size h, and x, > x, > 0, 
Y2>Y,29. This represents approximately an actual field example, in which a definite 
forward move is made in each generation due to vegetative spreading of the plants, and the 
probability of having two successive plants very close together is small. The other models 
discussed are developments of these two basic types and have no intrinsic mathematical 
interest. In all of them, however, an attempt has been made to keep the ecological aspect 
in mind so that they should be descriptive in some way of an idealized plant community. 


I wish to acknowledge gratefully my indebtedness to Prof. M. S. Bartlett for suggesting
this subject of research, and for his advice and assistance during my investigations at 
Manchester University. I also wish to thank the New Zealand Department of Scientific 
and Industrial Research for financial assistance at that time. 
THE OUTCOME OF A STOCHASTIC EPIDEMIC—A NOTE ON
BAILEY’S PAPER 


By P. WHITTLE 


Applied Mathematics Laboratory, New Zealand Department
of Scientific and Industrial Research 


In a recent paper (1953) N. Bailey has considered a stochastic epidemic model of the type set up by 
Bartlett (1949), and has shown that the probability distribution ($P_w$) of the ultimate number of
infected individuals (w) may be calculated by solving a certain set of doubly recurrent relations. 
I propose to show that for quite a general case these same probabilities may be obtained by the 
solution of a set of singly recurrent relations (eqs. (17) and (24)). Furthermore, an expression may 
be derived (eqs. (38) and (40)) for the probability that an infection introduced into a large population 
will ‘take’—this provides a stochastic equivalent to Kermack & McKendrick’s threshold theorem 
(1927 and later). 


1. INTRODUCTORY 


Following Bailey, we shall assume that an initial $a$ infectious cases are introduced into a
population of $n$ uninfected but susceptible individuals, and that the probability that after
a time $t$ there are $r$ susceptibles still uninfected and $s$ infectious cases not yet removed
is $p_{rs}(t)$. Let $Bs\,\Delta t$ be the probability that an infectious individual is removed in the
infinitesimal time interval $(t, t+\Delta t)$, and let $A_r s\,\Delta t$ be the corresponding probability that a
new infection takes place. No particular form is assumed for the function $A_r$ at the moment.
The development of the probabilities $p_{rs}$ is then governed by the relations

$$\frac{dp_{rs}}{dt} = A_{r+1}(s-1)\,p_{r+1,s-1} + B(s+1)\,p_{r,s+1} - (A_r s + Bs)\,p_{rs}, \qquad (1)$$

which become

$$(\lambda + A_r s + Bs)\,q_{rs} = A_{r+1}(s-1)\,q_{r+1,s-1} + B(s+1)\,q_{r,s+1} + \delta_{rn}\delta_{sa} \qquad (2)$$

if the transformation

$$q_{rs} = \int_0^\infty e^{-\lambda t}\,p_{rs}(t)\,dt \qquad (3)$$

is performed. As Bailey observes, the probability of an epidemic of total size $w$ is then

$$P_w = B f_{n-w,1}, \qquad (4)$$

where

$$f_{rs} = \lim_{\lambda \to 0} q_{rs}. \qquad (5)$$

This limit exists as long as $s > 0$, and the equations regulating the appropriate $f_{rs}$ are obtained
simply by setting $\lambda$ equal to zero in equation (2), with $s = 1, 2, \ldots$. These are, in effect,
the equations used by Bailey for the computation of the $P_w$.


2. ESTABLISHMENT OF THE RECURRENCE RELATIONS 





If we write

$$h_{r0} = 0, \qquad h_{rs} = s f_{rs} \quad (s = 1, 2, \ldots), \qquad (6)$$

$$\alpha_r = \frac{B}{A_r + B}, \qquad (7)$$

$$\beta_r = \frac{A_{r+1}}{A_r + B}, \qquad (8)$$

then the equations for the $f_{rs}$ take the form

$$h_{rs} = \alpha_r h_{r,s+1} + \beta_r h_{r+1,s-1} \quad (r = n-1, n-2, \ldots;\ s = 1, 2, \ldots), \qquad (9)$$

$$h_{ns} = \alpha_n h_{n,s+1} + \frac{\beta_n \delta_{sa}}{A_{n+1}} \quad (s = 1, 2, \ldots). \qquad (10)$$

Now, let

$$H_r(x) = \sum_{s \ge 1} h_{rs}\,x^{s+1}. \qquad (11)$$

Multiplying (9) by $x^{s+1}$ and summing over $s$ we find

$$H_r(x) = \frac{x^2}{x - \alpha_r}\,[\beta_r H_{r+1}(x) - \alpha_r h_{r1}]. \qquad (12)$$

A direct solution of (9) shows that

$$h_{r1} = \frac{\beta_r}{\alpha_r}\,H_{r+1}(\alpha_r), \qquad (13)$$

as, indeed, it must, if the expression (12) for $H_r$ is to constitute a finite series in $x$. We have
thus

$$H_r(x) = \frac{\beta_r x^2}{x - \alpha_r}\,[H_{r+1}(x) - H_{r+1}(\alpha_r)], \qquad (14)$$

a relation which certainly holds for $r = n-1, n-2, \ldots$, and also for $r = n$ if we introduce
a function

$$H_{n+1}(x) = \frac{x^a}{A_{n+1}} \qquad (15)$$

(cf. eq. (10)). Further, by (4), (6) and (13) we have

$$P_w = \frac{B\beta_{n-w}}{\alpha_{n-w}}\,H_{n-w+1}(\alpha_{n-w}). \qquad (16)$$

The required probabilities $P_w$ can thus be derived by solving the simply recurrent relation
(14) with initial condition (15), and then using (16). By doing almost precisely this we shall
obtain an explicit recurrence relation for the $P_w$. With the help of (14) and (15) $H_{n-u+1}(x)$
can be expressed in terms of $H_{n-u+2}(\alpha_{n-u+1}), \ldots, H_{n+1}(\alpha_n), H_{n+1}(x)$. Setting $x = \alpha_{n-u}$ in this
expression and substituting for the $H_{r+1}(\alpha_r)$ from (16) we find

$$\sum_{w=0}^{u} K_{n-u+1,n-u} K_{n-u+2,n-u} \cdots K_{n-w,n-u}\,\frac{\alpha_{n-w}}{B\beta_{n-w}}\,P_w = K_{n-u+1,n-u} K_{n-u+2,n-u} \cdots K_{n,n-u}\,\frac{\alpha_{n-u}^{a}}{A_{n+1}}, \qquad (17)$$

where

$$K_{rs} = \frac{\beta_r \alpha_s^2}{\alpha_s - \alpha_r} \quad (r \ne s). \qquad (18)$$

The final relation of (17), for $u = n$, reduces to

$$\sum_{w=0}^{n} P_w = 1, \qquad (19)$$

provided that $A_0$ is zero, as it must be.

Complications arise if any of the $A_r$'s (and consequently the corresponding $\alpha_r$'s) are equal.
When this happens $P_w$ involves not only $H_{n+1}(x)$ but also its derivatives. The extreme case
in this direction is that for which $A_r$ is constant for $r > 0$,

$$A_r = A \quad (r = 1, 2, \ldots), \qquad A_0 = 0, \qquad (20)$$

as is approximately the case during the initial stages of an epidemic if the number of
susceptibles is large. The appropriate modification of (17) may be derived by a repeated
application of de l’Hôpital’s rule, and may be shown by induction to have a solution

$$P_w = \frac{A^w B^{a+w}}{(A+B)^{a+2w}}\,\frac{a\,(a+2w-1)!}{w!\,(a+w)!} \quad (w = 0, 1, \ldots, n-1), \qquad P_n = 1 - \sum_{w=0}^{n-1} P_w. \qquad (21)$$

(Formula (21) is a generalization of one obtained by D. G. Kendall in a slightly different
context, see formula (52) of his paper of 1948.) We shall return to this case later, and shall
for the moment consider the more usual alternative

$$A_r = Cr, \qquad (22)$$

where $C$ is a constant. We have

$$\alpha_r = \frac{B}{B + Cr}, \qquad \beta_r = \frac{C(r+1)}{B + Cr}, \qquad K_{rs} = \frac{B(r+1)}{(B + Cs)(r - s)}, \qquad (23)$$

and (17) becomes

$$\sum_{w=0}^{u} \binom{n-w}{u-w}\,\alpha_{n-u}^{-w}\,P_w = \binom{n}{u}\,\alpha_{n-u}^{a}. \qquad (24)$$

For computational purposes it is convenient to consider instead of $P_w$ the quantity

$$Q_w = \frac{(n-w)!}{n!}\,P_w, \qquad (25)$$

for which

$$\sum_{w=0}^{u} \frac{\alpha_{n-u}^{-w}\,Q_w}{(u-w)!} = \frac{\alpha_{n-u}^{a}}{u!}. \qquad (26)$$


3. THE PROBABILITY OF EPIDEMIC 


Let us now return to expression (21). We shall use this as a comparison formula for esta- 
blishing the behaviour of more refined models. Note, however, that the model upon which 
it is based is a perfectly valid one, which for quite large ranges of w is more realistic than that 
corresponding to assumption (22), since this assumption requires that the population mix 
homogeneously, a requirement never fulfilled in a large population. 

The following table gives the first few values of $P_w$ as calculated from formulae (21) and
(24) for the case $a = 2$, $n = 30$, $\rho = B/C = 30B/A = 10$:

  w                    0             1         2         3
  P_w  (A_r = A)       0.0625        0.0234    0.0110    0.0058
  P_w  (A_r = Cr)      [illegible]   0.0251    0.0122    0.0078






The difference between the two sets of probabilities is small but increases with increasing w. 
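
The two rows of the table can be recomputed in a few lines. The sketch below is only illustrative and rests on two readings of the text that are assumptions here: that (21) is $P_w = a(a+2w-1)!\,A^wB^{a+w}/\{w!(a+w)!(A+B)^{a+2w}\}$, and that for $A_r = Cr$ the single recurrence can be written $\sum_{w=0}^{u}\binom{n-w}{u-w}\alpha_{n-u}^{-w}P_w = \binom{n}{u}\alpha_{n-u}^{a}$ with $\alpha_r = B/(B+Cr)$. The function names are hypothetical, and exact rational arithmetic is used because the recurrence involves heavy cancellation.

```python
from fractions import Fraction
from math import comb, factorial


def final_size_constant(a, n, A, B):
    """P_w for w = 0..n-1 from the closed form, constant intensity A (assumed form of (21))."""
    A, B = Fraction(A), Fraction(B)
    probs = []
    for w in range(n):
        coeff = Fraction(a * factorial(a + 2 * w - 1),
                         factorial(w) * factorial(a + w))
        probs.append(coeff * A**w * B**(a + w) / (A + B)**(a + 2 * w))
    return probs


def final_size_linear(a, n, rho):
    """P_w for w = 0..n from the single recurrence with A_r = C r and rho = B/C."""
    rho = Fraction(rho)
    P = []
    for u in range(n + 1):
        alpha = rho / (rho + (n - u))      # alpha_{n-u} = B / (B + C (n - u))
        rhs = comb(n, u) * alpha**a
        known = sum(comb(n - w, u - w) * P[w] / alpha**w for w in range(u))
        P.append((rhs - known) * alpha**u)  # the w = u term has coefficient alpha^{-u}
    return P


if __name__ == "__main__":
    const = final_size_constant(a=2, n=30, A=30, B=10)
    linear = final_size_linear(a=2, n=30, rho=10)
    for w in range(4):
        print(w, round(float(const[w]), 4), round(float(linear[w]), 4))
```

Under these assumptions both models give $P_0 = \{B/(A_n+B)\}^a = 1/16$, since no infection has yet occurred when the initial cases are all removed; the two rows separate only from $w = 1$ onwards.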
We shall now adopt the following definition: It shall be said that an epidemic has (has not)
taken place if the total proportion of susceptibles which become infected exceeds (does
not exceed) a predetermined fraction $y$. With this definition the probability of no epidemic is

$$\pi_y = \sum_{w=0}^{ny} P_w, \qquad (27)$$

where $P_w$ is in general given by (17).

Consider now two other models for which the infection intensities are given by $A'_r$, $A''_r$.
We shall assume that in all three cases the infection intensity is non-decreasing with
increasing $r$:

$$A'_{r+1} \ge A'_r, \qquad A_{r+1} \ge A_r, \qquad A''_{r+1} \ge A''_r \quad (r = 0, 1, \ldots). \qquad (28)$$

We shall further assume that, at least for the range $n \ge r \ge n(1-y)$, the intensity for the
first model lies uniformly between those for the other two

$$A'_r \ge A_r \ge A''_r. \qquad (29)$$

It is then intuitively evident that

$$\sum_{0}^{ny} P'_w \le \sum_{0}^{ny} P_w \le \sum_{0}^{ny} P''_w. \qquad (30)$$

Suppose now that the intensities for those two comparison models have the constant values

$$A'_r = A_n, \qquad A''_r = A_{n(1-y)} \quad (r > 0), \qquad (31)$$

while $A'_0 = A''_0 = 0$. Condition (29) is thus fulfilled, and the inequality (30) becomes

$$\sum_{0}^{ny} S_w(A_n) \le \pi_y \le \sum_{0}^{ny} S_w(A_{n(1-y)}), \qquad (32)$$

where we have used $S_w(A)$ to denote the expression in (21).

Consider now the partial sum $\sum_0^{ny} S_w(A)$ as $n$ becomes large. We have, for large $w$,

$$\frac{S_{w+1}}{S_w} = \frac{AB}{(A+B)^2}\,\frac{(a+2w+1)(a+2w)}{(w+1)(a+w+1)} \sim \frac{4AB}{(A+B)^2} = 4k \ \text{(say)}. \qquad (33)$$

The quantity $4k$ will be less than unity, except for the case $A = B$, which we shall exclude
for the moment. We can thus write

$$\sum_{0}^{ny} S_w(A) = \sum_{0}^{\infty} S_w(A) - R_{ny}(A), \qquad (34)$$

where

$$R_{ny}(A) = \sum_{ny+1}^{\infty} S_w(A) < S_{ny+1}(A)\,[1 + (4k) + (4k)^2 + \ldots] = O[(4k)^{ny}]. \qquad (35)$$

The infinite sum in (34) has the value

$$\sum_{0}^{\infty} S_w(A) = \left[\frac{A + B - |A - B|}{2A}\right]^a. \qquad (36)$$

Combining (32), (34), (35) and (36) we have thus

$$\left[\frac{A_n + B - |A_n - B|}{2A_n}\right]^a - R_{ny}(A_n) \le \pi_y \le \left[\frac{A_{n(1-y)} + B - |A_{n(1-y)} - B|}{2A_{n(1-y)}}\right]^a. \qquad (37)$$

For large $n$ the remainder term in the first member of inequality (37) is quite negligible,
and we shall no longer include it. Note now from (36) that we have $\sum S_w(A) = 1$ or $(B/A)^a$
according as $A$ is less or greater than $B$. Setting these evaluations in (37) we find that there
are at least three distinct cases:

$$\begin{aligned}
A_n > B,\ A_{n(1-y)} > B &: \quad (B/A_n)^a \le \pi_y \le (B/A_{n(1-y)})^a, \\
A_n > B,\ A_{n(1-y)} < B &: \quad (B/A_n)^a \le \pi_y \le 1, \\
A_n < B,\ A_{n(1-y)} < B &: \quad \pi_y = 1.
\end{aligned} \qquad (38)$$

We may sum up the situation in terms of Bailey’s removal ratio

$$\rho_n = B/C = nB/A_n, \qquad (39)$$

the ratio of removal and infection rates for a population of size $n$:

For $\rho_n < n$ and $\rho_{n(1-y)} < n(1-y)$, the probability of epidemic lies between
$1 - (\rho_n/n)^a$ and $1 - \{\rho_{n(1-y)}/n(1-y)\}^a$.
For $\rho_n < n$ and $\rho_{n(1-y)} > n(1-y)$, the probability of epidemic lies between
zero and $1 - (\rho_n/n)^a$.
For $\rho_n > n$ and $\rho_{n(1-y)} > n(1-y)$, the probability of epidemic is zero. (40)

These statements provide an equivalent in the stochastic case to Kermack & McKendrick’s
threshold theorem in the deterministic case, at least for the case of large populations.
Since for large $n$ the ratio $\rho_n/n = B/A_n$ will tend to an almost constant value, statements
(40) may be roughly condensed to

For $\rho_n < n$ the probability of epidemic is $1 - (\rho_n/n)^a$,
For $\rho_n > n$ the probability of epidemic is zero. (41)
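
The limiting values behind these statements are easy to check numerically. The following is a minimal sketch, not the author's computation: it assumes $S_w(A)$ is the constant-intensity expression (21), builds its partial sums from the term ratio quoted in (33), and compares them with the closed value $[(A+B-|A-B|)/2A]^a$, which is 1 for $A < B$ and $(B/A)^a$ for $A > B$. The function names are hypothetical.

```python
def partial_sum_S(a, A, B, terms):
    """Sum of S_w(A) for w = 0..terms-1, built from the ratio S_{w+1}/S_w."""
    k = A * B / (A + B) ** 2
    s = (B / (A + B)) ** a              # S_0: all initial cases removed first
    total = 0.0
    for w in range(terms):
        total += s
        s *= k * (a + 2 * w + 1) * (a + 2 * w) / ((w + 1) * (a + w + 1))
    return total


def limit_value(a, A, B):
    """Closed value of the infinite sum: 1 if A <= B, (B/A)^a if A > B."""
    return ((A + B - abs(A - B)) / (2 * A)) ** a


if __name__ == "__main__":
    for A, B in [(30.0, 10.0), (5.0, 10.0)]:
        print(A, B, partial_sum_S(2, A, B, 3000), limit_value(2, A, B))
```

The terms decay roughly like $(4k)^w$, so a few hundred terms suffice unless $A$ is very close to $B$, which is the transition case excluded in the text.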

The transition case $\rho_n = n$ cannot be adequately treated by approximate considerations
of the present type. Equation (28) indicates that the probability of completion is of fairly
constant magnitude for small $w$, roughly of order $(\tfrac{1}{2})^a$. As the number of susceptibles
diminishes, however, the critical value of $\rho$ will fall, and it seems likely that the epidemic
will eventually be halted, although only after having made appreciable inroads on the
population.

The statements (36) are reminiscent of similar statements concerning the natural extinc- 
tion of populations (cf. Bartlett, 1946) and could have been derived by regarding the group 
of infected persons as a population with birth and death rates A/n and B. To reason in this 
way is unsatisfactory, however, since the condition that the infected group shall ultimately 
disappear provides no guarantee that infection will be confined to a preassigned fraction 
of the population of susceptibles. 

The argument above is probably more illuminating in the following intuitive form. The
probability distribution $P_w$ of (24) presents two different forms according as $A$ is less than
or greater than $B$ (see Fig. 1). In both cases $P_w$ dwindles with increasing $w$, and if the population
size is large enough to permit $w$ to take large values $P_w$ will finally approach zero. In
the case $A < B$ the sum of the $P_w$ up to this stage will approach unity. In the case $A > B$ this
sum will have some value less than unity, say $1 - \alpha$, so that $P_n$ must have a finite value $\alpha$
if relation (19) is to be fulfilled.
[Figure: four panels showing $P_w$ plotted against $w$, with $ny$ and $n$ marked on the $w$-axis: $n$ small, $A < B$; $n$ small, $A > B$; $n$ large, $A < B$; $n$ large, $A > B$.]

Fig. 1. The appearance of the distribution curve $P_w$ of the ultimate number of infected
persons, on the assumption of a constant infection rate.

For large $n$ the probability of no epidemic

$$\pi_y = \sum_{0}^{ny} S_w(A) \qquad (42)$$

will be equal to the area under the initial part of the curve: unity if $A < B$, $1 - \alpha$ if $A > B$.

The fact that all probability mass which does not fall in the first J-shaped part of the
curve falls at $w = n$ indicates that either the epidemic keeps within bounds (probability
$1 - \alpha$) or else it infects the entire population (probability $\alpha$).

Models with varying A show a similar, although less extreme behaviour. Thus the 
distribution curves calculated by Bailey are either J-shaped or U-shaped, depending upon 
the relative values of the removal ratio and the population size. The J-shaped curves 
correspond to cases in which the infection is almost certainly confined to a small proportion
of the population. The U-shaped curves correspond to cases in which the infection strikes at 
either a small proportion or a large proportion of the population, the probabilities of these 
two alternatives being equal to the integrals of the corresponding limbs of the probability 
distribution. 

What our argument asserts in effect is that for a large population the form and integral 
of the first limb of the distribution $P_w$ is equal to that calculated on the assumption of a
constant infection intensity. 
A NOTE ON BAILEY’S AND WHITTLE’S TREATMENT OF
A GENERAL STOCHASTIC EPIDEMIC 


By F. G. FOSTER 
Division of Research Techniques, London School of Economics 


1. Introduction. In a recent paper Bailey (1953) obtained a set of doubly recurrent relations
for the probability distribution ($P_w$) of the total size ($w$) of a stochastic epidemic.
Bailey also quotes an explicit formula for $P_w$ due to the author but observes that this is not
suitable for computation. In a note on Bailey’s paper, Whittle (1955)* showed that the $P_w$
could be more simply calculated from a set of singly recurrent relations.

In the present note, Whittle’s relations are considered. The notation is simplified by use
of symmetric functions, and it is shown how a set of singly recurrent relations may be
obtained for $P_w$ in a quite general case by use of a simple probability argument. Whittle’s
relations may then be re-derived as a special case of these relations.

Following Bailey and Whittle, we consider a population which consists initially of $n$
uninfected but susceptible individuals and $a$ infected individuals. At any time $t$, if there
are $r$ susceptibles and $s$ infected cases, the probability of one new infection taking place in
time $dt$ is $rs\,dt$ and the probability of one infected being removed from circulation is $\rho s\,dt$.
The epidemic ends whenever either all infected have been removed or the whole population
of $a + n$ individuals has become infected. If when the epidemic ends the number of new
infections is $w$ (i.e. not counting the original $a$ infections), we say that the epidemic is of
total size $w$, and we denote by $P_w$ the probability of this event.

2. The problem may be treated as a random walk if we fix attention on the fluctuating
number of newly infected, $\xi$, present at any time. At any instant, $\xi$ may be either increased
by unity or decreased by unity, and we consider the sequence of instants at which such
changes occur.

Let us now represent the behaviour of $\xi$ by the motion of a particle on a rectangular
lattice, such that a move from $(x, y)$ to $(x+1, y)$ represents a unit increase in $\xi$, and a move
from $(x, y)$ to $(x, y+1)$ represents a unit decrease. Thus, at any instant, the $x$-co-ordinate
will represent the total number of infected cases (in addition to the initial $a$) and the
$y$-co-ordinate the total number of removals which has occurred up to that instant. We suppose
that the probabilities of the moves from $(x, y)$ to $(x+1, y)$ and to $(x, y+1)$ are respectively
$\lambda_x$, $\mu_x$ ($\lambda_x + \mu_x = 1$), which in general depend on $x$. For example, in Bailey’s problem:

$$\lambda_x = \frac{n - x}{n - x + \rho}, \qquad \mu_x = \frac{\rho}{n - x + \rho}.$$

The motion starts at $(0, 0)$, and stops as soon as any point of the barriers,

$$x = n, \qquad y = x + a,$$

is reached. This corresponds to the end of the epidemic, and we are interested in the
probability $P_w$ that the particle stops on $(w, a+w)$ $(w = 0, 1, \ldots, n-1)$. In the general case we
define

$$P_n = 1 - \sum_{w=0}^{n-1} P_w.$$


* See pp. 116-22 above of the present issue. 
Thus we have a Birth-and-Death process with the parameters $\lambda_x$, $\mu_x$, and the novelty
resides in the fact that these parameters are position-dependent. It will be noted, however,
that they are assumed independent of the $y$-co-ordinate. We study the problem for quite
general parameters of this type.


3. We now introduce some notation for symmetric functions. Denote by $h_l(x, m)$ the
homogeneous product-sum of weight $l$ in the $(m+1)$ quantities $\mu_x, \mu_{x+1}, \ldots, \mu_{x+m}$ (cf.
MacMahon, 1915). When these quantities are assumed distinct, define

$$h_l^{(j)}(x, m) = \mu_{x+j}^{\,l+m} \big/ \{(\mu_{x+j} - \mu_x) \cdots (\mu_{x+j} - \mu_{x+j-1})(\mu_{x+j} - \mu_{x+j+1}) \cdots (\mu_{x+j} - \mu_{x+m})\}.$$

Then we have the formula

$$h_l(x, m) = \sum_{j=0}^{m} h_l^{(j)}(x, m).$$

Now define

$$q_l(x, m) = \lambda_x \lambda_{x+1} \cdots \lambda_{x+m-1}\,h_l(x, m),$$

and

$$q_l^{(j)}(x, m) = \lambda_x \lambda_{x+1} \cdots \lambda_{x+m-1}\,h_l^{(j)}(x, m).$$

In this notation, it may be verified that Whittle’s set of recurrence relations becomes

$$\sum_{i=0}^{k} q_i^{(i)}(k-i, i)\,P_{k-i} = q_{a+k}^{(k)}(0, k) \quad (k = 0, 1, \ldots, n-1).$$



4. As Whittle observes, in these relations we have to assume that the $\mu$’s are all distinct.
We proceed now to derive the general formula, using a simple probability argument.

When no barriers are present, the probability that the particle attains the point
$(x+m, y+l)$ from the point $(x, y)$ is $q_l(x, m)$. Now $P_k$ is the probability of attaining $(k, a+k)$
from $(0, 0)$ by any path below the barrier. Therefore

$$P_k = q_{a+k}(0, k) - q_k(0, k)\,P_0 - q_{k-1}(1, k-1)\,P_1 - \ldots - q_1(k-1, 1)\,P_{k-1}.$$

We may rewrite this as the set of relations,

$$\sum_{i=0}^{k} q_i(k-i, i)\,P_{k-i} = q_{a+k}(0, k) \quad (k = 0, 1, \ldots, n-1).$$
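
The first-passage argument of this section lends itself to a direct check. The sketch below is an illustration only, under the assumption that the relation reads $P_k = q_{a+k}(0,k) - \sum_{w<k} q_{k-w}(w, k-w)\,P_w$ with $q_l(x,m) = \lambda_x \cdots \lambda_{x+m-1}\,h_l(x,m)$, and that Bailey's step probabilities are $\lambda_x = (n-x)/(n-x+\rho)$, $\mu_x = \rho/(n-x+\rho)$. It compares the recurrence with a brute-force propagation of probability over the lattice; all function names are hypothetical.

```python
from fractions import Fraction


def step_probs(n, rho):
    """Assumed Bailey step probabilities lambda_x, mu_x for x = 0..n-1."""
    lam = [Fraction(n - x) / (n - x + rho) for x in range(n)]
    mu = [Fraction(rho) / (n - x + rho) for x in range(n)]
    return lam, mu


def h(l, values):
    """Homogeneous product-sum of weight l in the given quantities."""
    row = [Fraction(1)] + [Fraction(0)] * l
    for z in values:                       # add one quantity at a time
        for d in range(1, l + 1):
            row[d] += z * row[d - 1]
    return row[l]


def q(l, x, m, lam, mu):
    """Unrestricted probability of going from (x, y) to (x + m, y + l)."""
    prod = Fraction(1)
    for i in range(x, x + m):
        prod *= lam[i]
    return prod * h(l, mu[x:x + m + 1])


def final_size_recurrence(a, n, rho):
    """P_0..P_{n-1} from the singly recurrent relations."""
    lam, mu = step_probs(n, rho)
    P = []
    for k in range(n):
        earlier = sum(q(k - w, w, k - w, lam, mu) * P[w] for w in range(k))
        P.append(q(a + k, 0, k, lam, mu) - earlier)
    return P


def final_size_direct(a, n, rho):
    """Same probabilities by propagating mass over the lattice below y = x + a."""
    lam, mu = step_probs(n, rho)
    mass = {(0, 0): Fraction(1)}
    P = [Fraction(0)] * n
    for total in range(2 * n + a):
        for x in range(min(total, n - 1) + 1):
            p = mass.pop((x, total - x), None)
            if p is None:
                continue
            y = total - x
            if y + 1 == x + a:             # removal step reaches the barrier
                P[x] += p * mu[x]
            else:
                mass[(x, y + 1)] = mass.get((x, y + 1), 0) + p * mu[x]
            if x + 1 < n:                  # infection step; x = n absorbs
                mass[(x + 1, y)] = mass.get((x + 1, y), 0) + p * lam[x]
    return P
```

Exact fractions are used so that the two routes can be compared for equality rather than to a tolerance.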




This formula is valid quite generally, and Whittle’s formula may be obtained from it 
by use of the following relations, which are easily verified: 


k 
Pe= XD WMk-i,i)r,, (k =0,1,...,2—1). 
i=0 


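5. As an example, we consider the special case where the $\lambda$’s and $\mu$’s are constant:

$$\lambda_x = \lambda, \qquad \mu_x = \mu.$$

Then

$$q_s(x, t) = \lambda^t \mu^s \binom{s+t}{t}.$$

Thus

$$\sum_{i=0}^{k} (\lambda\mu)^i \binom{2i}{i} P_{k-i} = \lambda^k \mu^{a+k} \binom{a+2k}{k} \quad (k = 0, 1, \ldots, n-1),$$

which has the solution

$$P_k = \lambda^k \mu^{a+k}\,\frac{a}{a+2k}\binom{a+2k}{k}.$$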
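
That the stated solution satisfies the recurrence can be confirmed exactly for small cases. The sketch below assumes the constant-parameter relations read $\sum_{i=0}^{k}(\lambda\mu)^i\binom{2i}{i}P_{k-i} = \lambda^k\mu^{a+k}\binom{a+2k}{k}$ with solution $P_k = \lambda^k\mu^{a+k}\,\frac{a}{a+2k}\binom{a+2k}{k}$; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def P_closed(k, a, lam, mu):
    """Assumed closed-form solution for constant lambda and mu."""
    return lam**k * mu**(a + k) * Fraction(a, a + 2 * k) * comb(a + 2 * k, k)


def recurrence_gap(k, a, lam, mu):
    """Left side minus right side of the k-th relation; zero if the solution fits."""
    lhs = sum((lam * mu)**i * comb(2 * i, i) * P_closed(k - i, a, lam, mu)
              for i in range(k + 1))
    rhs = lam**k * mu**(a + k) * comb(a + 2 * k, k)
    return lhs - rhs


if __name__ == "__main__":
    lam, mu = Fraction(3, 4), Fraction(1, 4)
    print(all(recurrence_gap(k, 2, lam, mu) == 0 for k in range(10)))
```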
It is interesting to note the connexion between the stochastic epidemic problem and
results related to the arc sine law in the theory of fluctuations in coin-tossing (cf. Feller,
1950, p. 252). Thus the case of constant $\lambda$, $\mu$ may be applied to the tossing of a biased coin
with probabilities $\lambda$, $\mu$ of heads or tails. Then $P_k$ is the probability that the number of heads
is greater than or equal to the number of tails plus $a$ for the first time at the $(2k + a)$th trial.
The probability

$$P_n = 1 - \sum_{k=0}^{n-1} P_k$$

is interpreted as the probability that in $(2n + a - 1)$ trials the number of heads is always less
than the number of tails plus $a$. For the particular case $a = 1$, $\lambda = \mu = \tfrac{1}{2}$, we have thus

$$P_n = 1 - \sum_{k=0}^{n-1} \frac{1}{2^{2k+1}}\,\frac{1}{2k+1}\binom{2k+1}{k},$$

and it may be verified that this equals $\dfrac{1}{2^{2n}}\dbinom{2n}{n}$, in agreement with the formula of Theorem 1
in Feller (1950, p. 252).
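
The identity quoted for $a = 1$, $\lambda = \mu = \tfrac{1}{2}$ can be verified term by term. This is a small illustrative check with hypothetical function names, using exact fractions; it assumes the sum has the form $1 - \sum_{k<n} 2^{-(2k+1)}(2k+1)^{-1}\binom{2k+1}{k}$.

```python
from fractions import Fraction
from math import comb


def never_reached(n):
    """1 minus the sum of P_k for k < n, with a = 1 and a fair coin."""
    return 1 - sum(Fraction(comb(2 * k + 1, k), (2 * k + 1) * 2 ** (2 * k + 1))
                   for k in range(n))


def central_term(n):
    """The central binomial probability C(2n, n) / 2^(2n)."""
    return Fraction(comb(2 * n, n), 2 ** (2 * n))


if __name__ == "__main__":
    print(all(never_reached(n) == central_term(n) for n in range(1, 30)))
```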
THE DETERMINISTIC MODEL OF A SIMPLE EPIDEMIC
FOR MORE THAN ONE COMMUNITY 


By S. RUSHTON AND A. J. MAUTNER
Imperial College, London 


1. INTRODUCTION 


In general, a deterministic model can be expected to give a satisfactory picture of a devel- 
oping process such as the spread of an epidemic, or the growth of a population, so long as 
the numbers of individuals are sufficiently large. On the other hand, if the numbers are 
small the continuous description is no longer valid; moreover, the effects of chance
occurrences become appreciable in any particular instance, so that a probability treatment
is necessary. The literature on deterministic and stochastic models for epidemics has
been reviewed by Bailey (1952): to the references given there should be added Bailey
(1953a, b). Much of this previous work refers essentially to a single community or to a set
of isolated communities. In the present note we consider the deterministic model of a 
simple epidemic for several related communities. A simple epidemic is one in which it is 
assumed that infection spreads only by contact between individuals, and that none of the 
infected individuals is removed from circulation by death, recovery or isolation. We do 
not suggest that the model we discuss here is a realistic one in the sense that any actual 
epidemic has followed precisely this over-simplified pattern, but all epidemic models so 
far discussed have lacked realism in some sense. We do suggest, however, that the model 
discussed in this note can provide a basis for further work along these particular lines so 
that more realistic and therefore more complicated models can be treated by starting from 
our basic solution. 


2. THE DIFFERENTIAL EQUATIONS FOR m COMMUNITIES 


We denote by $y_i(t)$ the number of susceptibles in the $i$th community at time $t$ and by $n_i$
the total size of the $i$th community ($i = 1, \ldots, m$). Then, on the assumptions (i) that there is
homogeneous mixing within each community with $\alpha_i$ as the internal infection rate in the
$i$th community and (ii) that there is also homogeneous mixing between communities with
$\beta_{ij}$ as the infection rate between the $i$th and $j$th communities, the differential equations
for the $y_i$’s are

$$\frac{dy_i}{dt} = -y_i\Big\{\alpha_i(n_i - y_i) + \sum_{j \ne i} \beta_{ij}(n_j - y_j)\Big\} \quad (i = 1, \ldots, m). \qquad (1)$$

Under the simplifying assumptions that $n_i = n$ and $\alpha_i = \alpha$ (all $i$), and that $\beta_{ij} = \alpha\gamma$
(all $i \ne j$) and taking $\alpha t$ as the time-scale and continuing to denote it by $t$, these equations
become

$$\frac{dy_i}{dt} = -y_i\Big\{(n - y_i) + \gamma \sum_{j \ne i} (n - y_j)\Big\} \quad (i = 1, \ldots, m). \qquad (2)$$

The parameter $\gamma$ in these equations is then the ratio of the assumed common ‘cross-infection
rate’ between communities to the common internal infection rate $\alpha$, which takes
account only of internally generated infection. The solutions of the set (2) are completely
determined by $m$ independent initial conditions, e.g. the $m$ values $y_i(0)$ ($i = 1, \ldots, m$), and
it follows that if any set of these initial values are equal then the corresponding functions
$y_i(t)$ are identical for all $t$. In particular, if all the $y_i(0)$ are equal, the set (2) reduces to a single
equation

$$dy/dt = -(1 + (m-1)\gamma)(n - y)\,y$$

for $y(t) = y_1(t) = \ldots = y_m(t)$. This is just the equation for a simple epidemic in a single
community with total infection rate $\alpha(1 + (m-1)\gamma)$.

Another particular case, whose solution is considered below in §§ 4 and 5, occurs when
$(m-1)$ of the initial values $y_i(0)$ are equal and the remaining one, say $y_1(0)$, is not equal to
the others. For example, this case arises if a single infected person appears initially in only
one of the communities (say the first), so that

$$y_1(0) = n - 1 \quad \text{and} \quad y_2(0) = y_3(0) = \ldots = y_m(0) = n.$$

The equations (2) in such a case reduce to the pair of equations

$$\begin{aligned}
dy_1/dt &= -y_1\{(n - y_1) + (m-1)\gamma(n - y_2)\}, \\
dy_2/dt &= -y_2\{\gamma(n - y_1) + (1 + (m-2)\gamma)(n - y_2)\},
\end{aligned} \qquad (3)$$

since $y_2(t) = y_3(t) = \ldots = y_m(t)$ for all $t$.
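
Before turning to the analytical solution, the system can be integrated numerically as a cross-check. The sketch below is not part of the authors' procedure: it assumes the equations in the scaled form $dY_i/ds = -Y_i\{(1 - Y_i) + \gamma\sum_{j\ne i}(1 - Y_j)\}$ with proportions $Y_i = y_i/n$ and time $s = nt$, and applies a classical fourth-order Runge-Kutta step. Function names and the step count are arbitrary choices.

```python
import math


def rates(Y, gamma):
    """dY_i/ds for the scaled equations (Y_i = y_i / n, s = n t)."""
    infected_total = sum(1.0 - y for y in Y)
    return [-y * ((1.0 - y) + gamma * (infected_total - (1.0 - y))) for y in Y]


def integrate(Y0, gamma, s_end, steps=2000):
    """Classical fourth-order Runge-Kutta from s = 0 to s = s_end."""
    Y = list(Y0)
    h = s_end / steps
    for _ in range(steps):
        k1 = rates(Y, gamma)
        k2 = rates([y + 0.5 * h * k for y, k in zip(Y, k1)], gamma)
        k3 = rates([y + 0.5 * h * k for y, k in zip(Y, k2)], gamma)
        k4 = rates([y + h * k for y, k in zip(Y, k3)], gamma)
        Y = [y + h * (a + 2 * b + 2 * c + d) / 6.0
             for y, a, b, c, d in zip(Y, k1, k2, k3, k4)]
    return Y


if __name__ == "__main__":
    # Two communities, 5% initially infected in the first, none in the second.
    print(integrate([0.95, 1.0], gamma=0.2, s_end=0.94))
```

When all initial values are equal the output can be compared with the single-community logistic solution $Y(s) = Y_0/\{Y_0 + (1 - Y_0)e^{(1+(m-1)\gamma)s}\}$, which gives a simple test of the integrator.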
3. A GENERAL SOLUTION 
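If the initial values $y_i(0)$ ($i = 1, \ldots, m$) are all distinct we can obtain a solution of the
equations (2) as follows. Writing equations (2) in the form

$$\frac{dy_i}{dt} = -y_i\Big\{an - y_i - \gamma \sum_{j \ne i} y_j\Big\} \quad (i = 1, \ldots, m), \qquad (4)$$

where $a = 1 + (m-1)\gamma$, and making the transformations

$$y_i = nY_i, \qquad s = nt, \qquad Y_i = e^{-as}U_i, \qquad (5)$$

we obtain the equations

$$\frac{1}{U_i}\frac{dU_i}{ds} = e^{-as}\Big(U_i + \gamma \sum_{j \ne i} U_j\Big) \quad (i = 1, \ldots, m). \qquad (6)$$

Changing the independent variable to

$$v = (1 - e^{-as})/a, \qquad (7)$$

these equations become

$$\frac{1}{U_i}\frac{dU_i}{dv} = U_i + \gamma \sum_{j \ne i} U_j \quad (i = 1, \ldots, m). \qquad (8)$$

Denoting the matrix of coefficients on the right of these equations by $A$, the vector of
elements $U_i$ by $U$, and denoting a column vector by $\{\ \}$, this set is

$$\frac{d}{dv}\{\log U_i\} = AU. \qquad (9)$$

It follows that

$$\sum_{j=1}^{m} a^{ij}\,\frac{d}{dv}\log U_j = \frac{d}{dv}\log\Big(\prod_{j=1}^{m} U_j^{a^{ij}}\Big) = U_i, \qquad (10)$$

where, corresponding to $A = [a_{ij}]$, we denote $A^{-1} = [a^{ij}]$.

Now, putting

$$X_i = \prod_{j=1}^{m} U_j^{a^{ij}}, \qquad (11)$$

so that

$$U_i = \prod_{j=1}^{m} X_j^{a_{ij}}, \qquad (12)$$

the equations (10) become

$$\frac{d}{dv}\log X_i = \frac{1}{X_i}\frac{dX_i}{dv} = \prod_{j=1}^{m} X_j^{a_{ij}} = \Big(\prod_{j=1}^{m} X_j\Big)^{\gamma} X_i^{1-\gamma} \qquad (13)$$

for $i = 1, \ldots, m$. We then have

$$X_i^{\gamma-2}\,\frac{dX_i}{dv} = \Big(\prod_{j=1}^{m} X_j\Big)^{\gamma} = F(v), \ \text{say}, \qquad (14)$$

for all $i$. Integrating these equations we obtain

$$X_i = (\alpha_i - G)^{-1/(1-\gamma)} \quad (i = 1, \ldots, m), \qquad (15)$$

where

$$G(v) = (1 - \gamma)\int_0^v F(v)\,dv, \qquad (16)$$

and

$$\alpha_i = (X_i(0))^{\gamma-1}. \qquad (17)$$

From (12) and the definition of $F(v)$ in (14) we have

$$U_i = \Big(\prod_{j=1}^{m} X_j\Big)^{\gamma} X_i^{1-\gamma} \qquad (18)$$

$$\phantom{U_i} = F(v)/(\alpha_i - G) \quad (i = 1, \ldots, m) \qquad (19)$$

from (15).

Since, from (14), (15) and (16),

$$F(v) = \Big\{\prod_{i=1}^{m} (\alpha_i - G)\Big\}^{-\gamma/(1-\gamma)} = \frac{1}{1-\gamma}\frac{dG}{dv}, \qquad (20)$$

we may obtain an integral expression for $v$ as

$$v = \frac{1}{1-\gamma}\int_0^G \Big\{\prod_{i=1}^{m} (\alpha_i - G)\Big\}^{\gamma/(1-\gamma)} dG. \qquad (21)$$

By means of (19) and (21), using the expression for $F(v)$ in terms of $G$ in (20), the solutions
for $U_i$ and the time parameter $v$ are given parametrically in terms of $G$.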


4, THE SPECIAL CASE OF EQUATIONS (3) 


In the special case mentioned at the end of § 2 in which $y_2(0) = \ldots = y_m(0)$ and $y_1(0) \ne y_2(0)$,
the general equations (2) reduce to the pair of equations (3).
We see from (5) and (18) that these conditions imply that

$$U_2(0) = U_3(0) = \ldots = U_m(0), \qquad U_1(0) \ne U_2(0)$$

and

$$X_2(0) = X_3(0) = \ldots = X_m(0), \qquad X_1(0) \ne X_2(0).$$

Defining

$$(X_1(0))^{\gamma-1} = \alpha, \qquad (22)$$

$$(X_i(0))^{\gamma-1} = \beta \quad \text{for } i = 2, 3, \ldots, m, \qquad (23)$$

it follows from (16) that

$$U_1 = F(v)/(\alpha - G) \qquad (24)$$

and

$$U_i = F(v)/(\beta - G) \quad (i = 2, \ldots, m). \qquad (25)$$











Now put

$$Z = U_1/U_2, \qquad W = U_2 - U_1, \qquad (26)$$

so that

$$U_2 = W/(1 - Z), \qquad U_1 = WZ/(1 - Z), \qquad (27)$$

and note that if $y_2(0) > y_1(0)$, which we may take to be the case, then $y_2(t) > y_1(t)$ for all $t$,
and it follows that $U_2 > U_1$, so that $W > 0$ and $0 < Z < 1$ for all $t$. Also we must have $\alpha > \beta$.
Expressing $W$ in terms of $Z$ by means of the above equations and the expression for $F(v)$
in (20), we obtain

$$W = A\{(1 - Z)^{\lambda\mu}/Z^{\lambda(\mu-1)}\}, \qquad (28)$$

where $\lambda = 1/(1-\gamma)$, $\mu = 2 + (m-2)\gamma$ and $A$ is an arbitrary constant $= (\alpha - \beta)^{1-\lambda\mu}$.
From (20) we have

$$\frac{dv}{dG} = \frac{\lambda}{F(v)}.$$

It is easily seen that this is equivalent to

$$\frac{dv}{dZ} = -\frac{\lambda}{ZW}, \qquad (29)$$

so that, corresponding to (21), we may express $v$ in terms of $Z$ as

$$v = B - \frac{\lambda}{A}\int_0^Z \{(1 - Z)^{\lambda\mu} Z^{1-\lambda(\mu-1)}\}^{-1}\,dZ, \qquad (30)$$

where $B$ is a second arbitrary constant. The integral in (30) is an Incomplete Beta Function
which can be expressed in terms of the hypergeometric function. In practice, however, it
is preferable to evaluate the integral in (30) directly.


5. PRACTICAL FORM OF THE SOLUTION 
Considering the practical form of the solution of § 4, we note that if we choose a new
time-parameter

$$T = (1 + (m-1)\gamma)\,v = 1 - e^{-(1+(m-1)\gamma)s}, \qquad (31)$$

where we recall that we have put $s = nt$, then $0 < T < 1$ and the whole history of the epidemic
is shown on a standard time scale extending over a unit interval. Equations (28) and (30)
constitute the solution to our problem, the solution being expressed parametrically in terms
of $Z = U_1/U_2 = Y_1/Y_2 = y_1/y_2$.

Defining

$$g(Z) = (1 - Z)^{\lambda\mu}/Z^{\lambda(\mu-1)}, \qquad (32)$$

then, from (28), (30) and (31),

$$W = A\,g(Z), \qquad (33)$$

and

$$T = C - (\lambda\mu - 1)\int_0^Z dZ/(WZ) \qquad (34)$$

$$\phantom{T} = C - ((\lambda\mu - 1)/A)\int_0^Z dZ/(Z g(Z)). \qquad (35)$$

The initial values of $W$ and $Z$ are

$$W_0 = U_2(0) - U_1(0) = Y_2(0) - Y_1(0) = (y_2(0) - y_1(0))/n, \qquad (36)$$

and $Z_0 = y_1(0)/y_2(0)$. We notice incidentally that if $y_2(0) = n$, so that initially there are no
infected persons in any community other than the first, then $Z_0 = 1 - W_0$.
The arbitrary constants $A$ and $C$ are clearly given by

$$A = W_0/g(Z_0) \qquad (37)$$

and

$$C = ((\lambda\mu - 1)/A)\int_0^{Z_0} dZ/(Z g(Z)). \qquad (38)$$
From equations (5) and (31) the proportions $Y_1$ and $Y_2$ of susceptibles in the
communities at any time are

$$Y_2 = (1 - T)\,U_2 = (1 - T)\,W/(1 - Z) = A(1 - T)\,g(Z)/(1 - Z) \qquad (39)$$

in virtue of (27), and

$$Y_1 = Z\,Y_2. \qquad (40)$$
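
The parametric recipe can be sketched in code. This is a hedged illustration, not the authors' computing scheme: it assumes $g(Z) = (1-Z)^{\lambda\mu}/Z^{\lambda(\mu-1)}$, $A = W_0/g(Z_0)$, $T = ((\lambda\mu-1)/A)\{\int_0^{Z_0} - \int_0^{Z}\}\,dZ/(Zg(Z))$, $s = -\log_e(1-T)/(1+(m-1)\gamma)$ and $Y_2 = A(1-T)g(Z)/(1-Z)$, and it evaluates the integral by a composite Simpson rule, which is a choice made here. All function names are hypothetical.

```python
import math


def g(Z, lam, mu):
    """Assumed form of g(Z) = (1 - Z)^(lam mu) / Z^(lam (mu - 1))."""
    return (1.0 - Z) ** (lam * mu) / Z ** (lam * (mu - 1.0))


def integral_to(Z, lam, mu, steps=20000):
    """Composite Simpson estimate of the integral of dZ / (Z g(Z)) from 0 to Z."""
    def f(z):
        return z ** (lam * (mu - 1.0) - 1.0) * (1.0 - z) ** (-lam * mu)
    h = Z / steps                      # steps must be even
    total = f(0.0) + f(Z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


def state_at(Z, gamma, m, Y1_0, Y2_0):
    """Return (T, s, Y2, Y1) at the parameter value Z."""
    lam = 1.0 / (1.0 - gamma)
    mu = 2.0 + (m - 2) * gamma
    Z0 = Y1_0 / Y2_0
    A = (Y2_0 - Y1_0) / g(Z0, lam, mu)
    scale = (lam * mu - 1.0) / A
    T = scale * (integral_to(Z0, lam, mu) - integral_to(Z, lam, mu))
    s = -math.log(1.0 - T) / (1.0 + (m - 1) * gamma)
    Y2 = A * (1.0 - T) * g(Z, lam, mu) / (1.0 - Z)
    return T, s, Y2, Z * Y2


if __name__ == "__main__":
    for Z in (0.95, 0.90, 0.80, 0.70):
        print(Z, state_at(Z, gamma=0.2, m=2, Y1_0=0.95, Y2_0=1.0))
```

With $\gamma = 0.2$, $m = 2$ and $Z_0 = 0.95$ this reading gives $g(0.95) \approx 5.96 \times 10^{-4}$, in line with the first entry of Table 1; small differences in later columns are to be expected from rounding in a hand computation.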


Table 1. Numerical procedure: $\gamma = 0.2$, $m = 2$, $Z_0 = 0.95$

  Z      g(Z)        ∫dZ/(Zg(Z))   T       s      Y_2     Y_1     -dY_2/ds   -dY_1/ds   -(dY_1/ds + dY_2/ds)
  0.95   5.96×10⁻⁴   57.03         0.000   0.00   1.000   0.950   0.010      0.047      0.057
  0.94   9.53×10⁻⁴   42.94         0.252   0.24   0.996   0.937   0.016      0.060      0.077
  0.93   1.42×10⁻³   33.73         0.417   0.45   0.992   0.923   0.023      0.073      0.096
  0.92   2.01×10⁻³   27.32         0.531   0.63   0.987   0.908   0.031      0.086      0.117
  0.91   2.73×10⁻³   22.65         0.615   0.80   0.981   0.893   0.039      0.099      0.138
  0.90   3.61×10⁻³   19.13         0.678   0.94   0.975   0.877   0.049      0.112      0.161
  0.89   4.64×10⁻³   16.40         0.727   1.08   0.968   0.862   0.057      0.123      0.181
  0.88   5.85×10⁻³   14.23         0.765   1.21   0.959   0.844   0.069      0.138      0.207
  0.87   7.25×10⁻³   12.47         0.797   1.33   0.951   0.827   0.080      0.151      0.231
  0.86   8.85×10⁻³   11.03         0.823   1.44   0.941   0.809   0.092      0.164      0.256
  0.85   1.07×10⁻²    9.83         0.844   1.55   0.930   0.791   0.104      0.177      0.280
  0.84   1.27×10⁻²    8.81         0.862   1.66   0.915   0.769   0.120      0.191      0.310
  0.83   1.50×10⁻²    7.95         0.878   1.75   0.907   0.753   0.129      0.200      0.330
  0.82   1.76×10⁻²    7.20         0.891   1.85   0.893   0.733   0.143      0.212      0.355
  0.81   2.05×10⁻²    6.56         0.903   1.94   0.880   0.713   0.157      0.222      0.379
  0.80   2.36×10⁻²    5.99         0.913   2.03   0.865   0.692   0.170      0.232      0.402

  0.75   4.48×10⁻²    3.99         0.949   2.47   0.774   0.580   0.240      0.270      0.510
  0.70   7.70×10⁻²    2.81         0.970   2.91   0.654   0.458   0.297      0.280      0.577
  0.65   1.24×10⁻¹    2.05         0.983   3.41   0.500   0.325   0.318      0.252      0.569
  0.60   1.92×10⁻¹    1.53         0.992   4.08   0.301   0.181   0.260      0.173      0.433
  0.55   2.87×10⁻¹    1.16         0.999   5.89   0.046   0.025   0.052      0.029      0.081












In terms of $Y_1$ and $Y_2$ and the time parameter $s$ ($0 < s < \infty$), where

$$s = -\{\log_e (1 - T)\}/(1 + (m-1)\gamma), \qquad (41)$$

the epidemic rate $-dY_1/ds$ in the first community and $-dY_2/ds$ in the other communities
are given by

$$-dY_1/ds = -Y_1\,(Y_1 + (m-1)\gamma Y_2 - (1 + (m-1)\gamma)),$$

$$-dY_2/ds = -Y_2\,(\gamma Y_1 + (1 + (m-2)\gamma)\,Y_2 - (1 + (m-1)\gamma)),$$

and the total epidemic rate is $-(dY_1/ds + (m-1)\,dY_2/ds)$.
For communities of total size $n$ the actual epidemic rates are just $-dy_i/dt = -n^2\,dY_i/ds$.
The numerical procedure in applying the solution is illustrated in Table 1 for the simple
case of m = 2 communities with y = 0.2 and initially 5% of infected individuals in the first
community and none in the second. In this case $Y_1(0) = 0.95$, $Y_2(0) = 1$, $W = 0.05$, $Z_0 = 0.95$,
A = $ and yu = 2. The arbitrary constants are A = 83.8712 and C = 1.0199. The principal
part of the numerical work is the computing of g(Z) and the integral ∫ dZ/(Zg(Z)). In this
example, the epidemic is effectively completed by the time that Z has decreased in value
to 0.55, when T = 0.999, and the values of g(Z) and ∫ dZ/(Zg(Z)) are therefore only shown


for 0.95 ≥ Z ≥ 0.55. If the epidemic rates $-dY_1/ds$ and $-dY_2/ds$ are plotted the curve for
$-dY_2/ds$ is at first below that for $-dY_1/ds$ as we should expect, crosses the latter at about its
maximum, rises to a slightly higher maximum and remains above the curve for $-dY_1/ds$
until the completion of the epidemic. The curve for the total epidemic rate falls somewhat
more slowly from its maximum than the rate at which it rises. We have examined a number
of cases, for different numbers m of communities and different values of y, and this last
feature appears to be generally true and is particularly noticeable for small m (e.g. m = 2)
and reasonably large values of the initial amount of infection in one community (e.g. above
10%) before infection in that community begins to affect susceptibles in the others.
This calls to mind the remark of Ross (1916): 'It is obvious from the mere examination of
many curves of epidemics that they are often remarkably symmetrical bell-shaped curves
which, however, frequently tend to fall somewhat more slowly than they rise....'

Fig. 1. $(Y_1, Y_2)$ curves for m = 2 and y = 0.2 and different initial values:
$Y_1(0)$ = 0.99, 0.97, 0.95, 0.90, 0.80, 0.70 and 0.50 and $Y_2(0) = 1$ in all cases.
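
The behaviour of the two epidemic rates described above can be checked by integrating the rate equations directly. The following Python sketch is an illustration only. It assumes the m = 2 form of the equations, $dY_1/ds = Y_1\{Y_1 + yY_2 - (1+y)\}$ and $dY_2/ds = Y_2\{yY_1 + Y_2 - (1+y)\}$, with the signs chosen so that the rates $-dY_i/ds$ are positive as in Table 1, and it uses a fixed-step Runge-Kutta scheme in place of the analytical solution. The names `derivatives`, `integrate` and `coupling` (standing for y) are hypothetical.

```python
def derivatives(y1, y2, coupling=0.2):
    """Return (dY1/ds, dY2/ds) for m = 2 communities (assumed form of the equations)."""
    d1 = y1 * (y1 + coupling * y2 - (1.0 + coupling))
    d2 = y2 * (coupling * y1 + y2 - (1.0 + coupling))
    return d1, d2


def integrate(y1, y2, s_end, coupling=0.2, step=0.001):
    """Fixed-step fourth-order Runge-Kutta; returns a list of (s, Y1, Y2)."""
    path = [(0.0, y1, y2)]
    for i in range(int(round(s_end / step))):
        k1 = derivatives(y1, y2, coupling)
        k2 = derivatives(y1 + 0.5 * step * k1[0], y2 + 0.5 * step * k1[1], coupling)
        k3 = derivatives(y1 + 0.5 * step * k2[0], y2 + 0.5 * step * k2[1], coupling)
        k4 = derivatives(y1 + step * k3[0], y2 + step * k3[1], coupling)
        y1 += step * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y2 += step * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        path.append(((i + 1) * step, y1, y2))
    return path


# Initial conditions of the worked example: 5% infected in the first community.
path = integrate(0.95, 1.0, s_end=8.0)
rate1 = [-derivatives(a, b)[0] for _, a, b in path]
rate2 = [-derivatives(a, b)[1] for _, a, b in path]
total = [r1 + r2 for r1, r2 in zip(rate1, rate2)]

_, y1_at_08, y2_at_08 = path[800]          # state at s = 0.80
print(round(y1_at_08, 3), round(y2_at_08, 3))
print(round(max(rate1), 3), round(max(rate2), 3), round(max(total), 3))
```

Under these assumptions the state at s = 0.80 should lie close to the corresponding row of Table 1, and the maximum of the second community's rate should exceed that of the first, as the text describes.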

A method of illustrating the solutions of the epidemic equations is shown in Fig. 1, where
for different initial conditions $(Y_1, Y_2)$ curves are plotted for the case m = 2 and y = 0.2.
The curves shown are for $Y_1(0)$ = 0.99, 0.97, 0.95, 0.90, 0.80, 0.70 and 0.50, and $Y_2(0) = 1$ in
all cases. Contours for the time parameter s, giving the times that points on the $(Y_1, Y_2)$
curves are reached from the start of the epidemic, and contours showing the value of the
total epidemic rate as the epidemic proceeds can also be put on this same graph. To avoid
confusion they have not been included in Fig. 1, but the latter are in fact conics. In this
particular example, we see from equations (42) with m = 2, that the contours
$$-(dY_1/ds + dY_2/ds) = k$$
are concentric ellipses
$$Y_1^2 + 2yY_1Y_2 + Y_2^2 - (1+y)(Y_1+Y_2) = -k,$$
having their common centre at the point $Y_1 = Y_2 = \tfrac12$, their minor semi-axes along the line $Y_1 = Y_2$
of lengths $\sqrt{\{(1+y-2k)/(2(1+y))\}}$ and their major semi-axes (perpendicular to $Y_1 = Y_2$) of
lengths equal to $\sqrt{\{(1+y)/(1-y)\}}$ times the corresponding minor semi-axes.


REFERENCES 


Bailey, N. T. J. (1952). Appl. Statist. 1, 149.
Bailey, N. T. J. (1953a). Biometrika, 40, 177.
Bailey, N. T. J. (1953b). Biometrika, 40, 279.
Ross, R. (1916). Proc. Roy. Soc. A, 92, 204.







EXACT TESTS FOR SERIAL CORRELATION 


By E. J. HANNAN 
Australian National University, Canberra, A.C.T. 


1. INTRODUCTION


In testing for first-order serial correlation in a stationary time series it is customary to use
some form of the first sample serial correlation coefficient. Though this statistic will,
presumably, be a most efficient estimator of the serial correlation $\rho$ it has two disadvantages:
(a) its exact distribution is very complicated so that approximations have to be used;
(b) it is a biased estimator and the analytical form of the bias is not known.

Ogawara (1951) has obtained an exact test for serial correlation by considering the
conditional distribution of the $x_{2t}$ for fixed values of the $x_{2t-1}$. His statistic $b_1$ is an unbiased
estimator of $2\rho(1+\rho^2)^{-1}$, where $\rho$ is the parameter of a simple Markoff process. At first sight
it seems obvious that the resulting estimator of $\rho$ will be inefficient. However, in this paper
it will be shown that its efficiency tends to 1 as $\rho$ tends to 0. It follows that Ogawara's
statistic leads to an exact test of the hypothesis which is asymptotically fully efficient (this
criterion is defined and justified below). Computationally the test is as simple as that based
on the first serial correlation coefficient.

Another statistic, $b_2$, is available which is also an unbiased estimator of $2\rho(1+\rho^2)^{-1}$.
This statistic is the coefficient of regression of the $x_{2t-1}$ on the neighbouring $x_{2t}$. The efficiency
of the estimator of $\rho$ based on the statistic $\tfrac12(b_1+b_2)$ is here shown to be $(1-\rho^2)$. While no
exact distribution theory is available for this statistic it could be useful in circumstances
where an estimate of a common serial correlation coefficient is required from a number of
otherwise differing simple Markoff processes, since it is also an unbiased estimator of
$2\rho(1+\rho^2)^{-1}$.

Ogawara also gave the extension of his results to the case of a multiple Markoff process 
of order h. In general, however, it appears that the efficiency of his estimates of the coeffi- 
cients of the autoregressive equation will not rise above 2/(h + 1), so that only in the first- 
order case are they ever fully efficient. 

The problem of testing for serial correlation in the residuals from a regression equation 
has been considered by Moran (1950) and Durbin & Watson (1950, 1951). Moran obtained 
asymptotic formulae for the mean and variance of the first circular serial correlation coeffi- 
cient computed from the residuals from a least-squares regression on a single independent 
variate. The exact distribution of the test statistic used by Durbin & Watson is of a com- 
plicated form and is only known for regression vectors satisfying certain conditions. How- 
ever, upper and lower bounds to the significance points, valid for any regression vectors, 
have been tabulated. In cases where the test statistic falls between the appropriate bounds 
an approximate test was suggested, based on the use of a beta distribution with the same
mean and variance as the test statistic.

In this paper it will be shown that Ogawara's method can be extended to give an exact
test, for the independence of the residuals from the regression equation, based on the z
distribution. Again, a statistic is obtained which is an unbiased estimator of $2\rho(1+\rho^2)^{-1}$
in the case where the residuals follow a first-order Markoff scheme. The estimator is, as $\rho$
tends to zero, asymptotically as efficient an estimator of $\rho$ as the first serial correlation
coefficient of the residuals and Durbin & Watson's statistic. It again follows that the
statistic leads to a test which is asymptotically fully efficient.

At the same time an exact test against given values of the regression coefficients is 
obtained in the case in which the residuals follow a simple Markoff process. 

One disadvantage of these tests results from the necessity of computing a separate least-squares
regression involving 2k+1 regressors, where k is the number of 'independent'
variates in the original regression equation. The estimates of the coefficients in the original 
equation, obtained from this regression, will, under some circumstances, have an asymp- 
totically smaller variance than the estimates obtained by straightforward least squares. 
However, over a wide range of these conditions the estimates of the regression coefficients 
obtained from the first differences of the observations will have an even smaller asymptotic 
variance, and over this range at least the variances of these last estimates will almost 
certainly be smaller for small samples. 

Finally, it should be emphasized that, though certain exact tests have been obtained, the 
powers of these tests have been judged from their asymptotic properties. For really small 
samples these tests may be far from optimal. 


2. THE CRITERION OF ASYMPTOTIC RELATIVE EFFICIENCY OF TESTS 


If $t_1$ and $t_2$ are statistics, computed from a sample of size $n$, their asymptotic efficiency as
tests of an hypothesis specified by a parameter value $\theta_0$ is defined as
$$E(t_1, t_2 \mid \theta_0) = \lim_{n\to\infty} \frac{V(t_2 \mid \theta = \theta_0)}{V(t_1 \mid \theta = \theta_0)} \left[\frac{\partial\mathcal{E}(t_1)/\partial\theta}{\partial\mathcal{E}(t_2)/\partial\theta}\right]^2_{\theta = \theta_0}.$$
Here $\mathcal{E}(t_i)$ and $V(t_i)$ are respectively the expected value and variance of $t_i$.

The justification for this criterion rests upon a theorem due to Pitman (see Stuart, 1954).
Pitman has shown that, under certain regularity conditions, for $t_1$ and $t_2$ with limiting normal
distributions and variances of order $n^{-1}$, the ratio of the sample sizes required for $t_1$ and $t_2$
to have the same power against alternative values of $\theta$ which differ from $\theta_0$ by quantities
of order $n^{-1/2}$ is in the limit given by $E(t_1, t_2 \mid \theta_0)$.


3. THE TEST FOR SERIAL CORRELATION IN A STATIONARY AUTOREGRESSIVE PROCESS 


Ogawara considered the stochastic process $(x_t - m) = \rho(x_{t-1} - m) + \epsilon_t$, where $\epsilon_t$ is normally
distributed with 0 mean and variance $\sigma^2(1-\rho^2)$, and $m$ is the mean of the process. Here
$|\rho| < 1$.

He showed that in the conditional distribution of the $x_{2t}$ $(t = 1, \ldots, n)$ for fixed values of
$x_{2t-1}$ $(t = 1, \ldots, n+1)$, the parameter $b = 2\rho(1+\rho^2)^{-1}$ appears as the regression coefficient
of the $x_{2t}$ on the fixed variates $\tfrac12(x_{2t-1} + x_{2t+1}) = z'_t$. Ogawara pointed out that the statistic
$$b_1 = \frac{\sum_1^n (x_{2t} - \bar{x}')(z'_t - \bar{z}')}{\sum_1^n (z'_t - \bar{z}')^2}$$
is then the maximum-likelihood estimator of $b$, and is unbiased, and that
$$F = \frac{(b_1 - b)^2 \sum_1^n (z'_t - \bar{z}')^2}{\sum_1^n (x_{2t} - a - b_1 z'_t)^2}\,(n-2),$$
which has the F distribution with 1 and $n-2$ degrees of freedom, can be used as an exact
test for any assigned value of $b$. Here $a = \bar{x}' - b_1\bar{z}'$.
For the test of significance of the hypothesis $\rho = 0$ the statistic F reduces to
$$(n-2)\,r^2(1-r^2)^{-1},$$
where $r$ is the coefficient of correlation between $x_{2t}$ and $\tfrac12(x_{2t-1} + x_{2t+1})$. This test is, therefore,
computationally as simple as that based on the first serial correlation coefficient.
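
The computation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reproduction of Ogawara's own procedure: the series is taken to hold $2n+1$ observations $x_1, \ldots, x_{2n+1}$, the even-numbered values are regressed on the averages of their two odd-numbered neighbours, and F is computed for the hypothesis $\rho = 0$ only. The function names and the simulated series are hypothetical.

```python
import math
import random


def ogawara_statistics(x):
    """Return (b1, r, F, n) for a series x_1, ..., x_{2n+1}, where x[0] is x_1."""
    if len(x) % 2 == 0 or len(x) < 7:
        raise ValueError("need an odd number of observations, at least 7")
    n = (len(x) - 1) // 2
    # Even-numbered observations x_{2t} and neighbour averages (x_{2t-1} + x_{2t+1}) / 2.
    evens = [x[2 * t - 1] for t in range(1, n + 1)]
    z = [0.5 * (x[2 * t - 2] + x[2 * t]) for t in range(1, n + 1)]
    x_bar = sum(evens) / n
    z_bar = sum(z) / n
    s_xz = sum((e - x_bar) * (v - z_bar) for e, v in zip(evens, z))
    s_zz = sum((v - z_bar) ** 2 for v in z)
    s_xx = sum((e - x_bar) ** 2 for e in evens)
    b1 = s_xz / s_zz                       # regression coefficient
    r = s_xz / math.sqrt(s_xx * s_zz)      # correlation coefficient
    residual_ss = s_xx - b1 * s_xz         # residual sum of squares about the fitted line
    f_stat = (n - 2) * b1 * b1 * s_zz / residual_ss
    return b1, r, f_stat, n


def simulate_markoff(rho, length, seed=1):
    """Simulate a stationary first-order Markoff (autoregressive) series with unit variance."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0)]
    sd = math.sqrt(1.0 - rho * rho)
    for _ in range(length - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, sd))
    return x


series = simulate_markoff(0.5, 40001)
b1, r, f_stat, n = ogawara_statistics(series)
print(n, round(b1, 3), round(r, 3), round(f_stat, 1))
```

For a long series with $\rho = 0.5$ the coefficient should settle near $2\rho/(1+\rho^2) = 0.8$, and the F value agrees with $(n-2)r^2/(1-r^2)$ up to rounding error.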
The variance of $b_1$, in the conditional distribution of the $x_{2t}$, is
$$\frac{\sigma^2(1-\rho^2)}{(1+\rho^2)}\,\frac{1}{\sum_1^n (z'_t - \bar{z}')^2},$$
and since this converges in probability as $n$ increases to
$$\frac{\sigma^2(1-\rho^2)}{(1+\rho^2)}\,\frac{2}{n\sigma^2(1+\rho^2)} = \frac{2}{n}\,\frac{(1-\rho^2)}{(1+\rho^2)^2},$$
the conditional variance of $b_1$ also converges in probability to†
$$\frac{2}{n}\,\frac{1-\rho^2}{(1+\rho^2)^2}.$$
Given $b_1$ we can estimate $\rho$ by
$$\hat\rho_1 = \frac{1 - \sqrt{(1-b_1^2)}}{b_1}.$$
This is not, of course, unbiased. Its variance will be, in the limit,
$$\frac{2}{n}\,\frac{1-\rho^2}{(1+\rho^2)^2}\,\frac{(1+\rho^2)^4}{4(1-\rho^2)^2} = \frac{1}{2n}\,\frac{(1+\rho^2)^2}{(1-\rho^2)}.$$
The variance of $r_1$, the first sample serial correlation coefficient, will tend to $\tfrac12(1-\rho^2)/n$
(Bartlett, 1946), and since it seems that this will be asymptotically most efficient (Wald,
1948) the efficiency of Ogawara's estimate, $\hat\rho_1$, will be
$$\left(\frac{1-\rho^2}{1+\rho^2}\right)^2.$$
The efficiency of $\hat\rho_1$ for certain values of $\rho$ is shown in Table 1.








Table 1

$|\rho|$                      0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
Efficiency of $\hat\rho_1$    0.96  0.85  0.70  0.52  0.36  0.22  0.12  0.05  0.01
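
The entries of Table 1 can be recomputed from the efficiency expression $\{(1-\rho^2)/(1+\rho^2)\}^2$, and the estimate of $\rho$ recovered from the regression coefficient by inverting $b = 2\rho/(1+\rho^2)$. The short Python sketch below does both; the function names are illustrative and the inversion takes the root with $|\rho| \le 1$, as in the text.

```python
import math


def rho_from_b(b):
    """Invert b = 2*rho / (1 + rho**2), taking the root with |rho| <= 1."""
    if abs(b) > 1.0:
        raise ValueError("no real root: |b| must not exceed 1")
    if b == 0.0:
        return 0.0
    return (1.0 - math.sqrt(1.0 - b * b)) / b


def efficiency_ogawara(rho):
    """Asymptotic efficiency of the estimate of rho relative to r_1."""
    return ((1.0 - rho ** 2) / (1.0 + rho ** 2)) ** 2


table1 = [round(efficiency_ogawara(k / 10.0), 2) for k in range(1, 10)]
print(table1)
print(rho_from_b(2 * 0.3 / (1 + 0.3 ** 2)))
```

The inversion breaks down when an observed coefficient exceeds 1 in absolute value, which the sketch signals with an exception rather than returning a complex value.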






































† For any observed $b_1$ the variance, for fixed $x_{2t-1}$, will be $\sigma^2(1-\rho^2)(1+\rho^2)^{-1}\left[\sum_1^n (z'_t - \bar{z}')^2\right]^{-1}$ of course.
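Another unbiased estimator of $b$ is available in the statistic
$$b_2 = \frac{\sum_1^{n-1} (x_{2t+1} - \bar{x}'')(z''_t - \bar{z}'')}{\sum_1^{n-1} (z''_t - \bar{z}'')^2},$$
where $z''_t = \tfrac12(x_{2t} + x_{2t+2})$.
The statistic $\bar{b} = \tfrac12(b_1 + b_2)$ will also be unbiased and its variance will be
$$\tfrac14\operatorname{var}(b_1) + \tfrac14\operatorname{var}(b_2) + \tfrac12\operatorname{cov}(b_1, b_2) = \frac{1}{n}\,\frac{1-\rho^2}{(1+\rho^2)^2} + \tfrac12\operatorname{cov}(b_1, b_2).$$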


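To the order $n^{-1}$ the sample means can be neglected and the covariance of $b_1$ and $b_2$
will be (to this order):
$$\frac{1}{\sigma^4(1+\rho^2)^2}\operatorname{cov}(a_1, a_2) + \frac{\rho^2}{\sigma^4(1+\rho^2)^4}\operatorname{cov}(c_1, c_2) - \frac{\rho}{\sigma^4(1+\rho^2)^3}\{\operatorname{cov}(a_1, c_2) + \operatorname{cov}(c_1, a_2)\},$$
where
$$a_1 = \frac{1}{n}\sum_1^n x_{2t}(x_{2t-1} + x_{2t+1}), \qquad a_2 = \frac{1}{n}\sum_1^{n-1} x_{2t+1}(x_{2t} + x_{2t+2}),$$
$$c_1 = \frac{1}{n}\sum_1^n (x_{2t-1} + x_{2t+1})^2, \qquad c_2 = \frac{1}{n}\sum_1^{n-1} (x_{2t} + x_{2t+2})^2.$$
By straightforward algebra we obtain, to the order $n^{-1}$,
$$\operatorname{cov}(b_1, b_2) = \frac{2}{n}\,\frac{(1-\rho^2)(1 - 4\rho^2 - \rho^4)}{(1+\rho^2)^4}.$$
The asymptotic variance of $\bar{b}$, therefore, is
$$\operatorname{var}(\bar{b}) = \frac{2}{n}\,\frac{(1-\rho^2)^2}{(1+\rho^2)^4}.$$
Hence, the variance of the estimator $\hat\rho = [1 - \sqrt{(1-\bar{b}^2)}]/\bar{b}$ tends to $(2n)^{-1}$, so that its
efficiency is $(1-\rho^2)$.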

Table 2 shows the efficiency of this statistic for certain values of $\rho$.

Table 2

$|\rho|$                    0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
Efficiency of $\hat\rho$    0.99  0.96  0.91  0.84  0.75  0.64  0.51  0.36  0.19
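
The limiting variance $(2n)^{-1}$ and the entries of Table 2 follow from a first-order expansion of $\rho$ as a function of $b = 2\rho(1+\rho^2)^{-1}$. The steps below are a sketch under that assumption, taking the asymptotic variance of the averaged statistic to be $(2/n)(1-\rho^2)^2(1+\rho^2)^{-4}$ and the variance of $r_1$ to be $(1-\rho^2)/(2n)$; the symbols $\bar{b}$ and $\hat\rho$ are labels chosen here for illustration.

```latex
\frac{db}{d\rho} = \frac{2(1-\rho^2)}{(1+\rho^2)^2},
\qquad
\operatorname{var}(\hat\rho) \approx \operatorname{var}(\bar{b})\left(\frac{d\rho}{db}\right)^{2}
  = \frac{2}{n}\,\frac{(1-\rho^2)^2}{(1+\rho^2)^4}\cdot\frac{(1+\rho^2)^4}{4(1-\rho^2)^2}
  = \frac{1}{2n},
\qquad
\text{efficiency} = \frac{(1-\rho^2)/(2n)}{1/(2n)} = 1-\rho^2 .
```

For $\rho = 0.5$ this gives $1 - 0.25 = 0.75$, the corresponding entry of Table 2.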






































Though no exact distribution theory is available for this statistic, yet, in spite of its
inefficiency, it could prove useful in circumstances where an estimate of a common serial
correlation coefficient is required from a number of otherwise differing simple Markoff
processes.† The lack of bias in $\tfrac12(b_1 + b_2)$ would enable the several estimates to be combined
and an estimator of $\rho$ to be obtained which would be consistent. The loss of efficiency, for
moderate $\rho$, would not be large.

† An example (due to E. J. Williams) has arisen in connexion with the serial correlation of measurements
at points along a log. The averaging of estimates from different logs, the only method of substantially
increasing the number of observations, would not result in a consistent estimator if the
estimates were obtained from the first sample serial correlation coefficient.

The statistics $b_1$, $b_2$ and $\bar{b} = \tfrac12(b_1 + b_2)$ have one major disadvantage however. The use of
the formula
$$\hat\rho_j = [1 - \sqrt{(1-b_j^2)}]/b_j$$
will lead to a consistent estimator of the first serial correlation coefficient only if the underlying
process is truly a simple Markoff process.

The fact that the efficiency of $\hat\rho_j$ tends to 1 as $\rho$ tends to 0 is important, however, for this
shows that the test of significance based on $b_j$ $(j = 1, 2)$ will be asymptotically fully efficient.
Though nothing certain can be said, from this result, about the power of the test in small
samples it seems likely that it will be nearly as powerful as that based on $r_1$ against the range
of alternatives represented by simple Markoff processes. The test based on $b_j$ has the
advantage of being exact.

Ogawara extended his results to a multiple Markoff process of order $h$. In this case the
regression of the variate $x_{t(h+1)}$ on the variates $x'_p = \tfrac12(x_{t(h+1)+p} + x_{t(h+1)-p})$ for $p = 1, \ldots, h$
is considered. In the first-order case the basic test statistic is the correlation coefficient
between $x_{2t}$ and $\tfrac12(x_{2t-1} + x_{2t+1})$, and the numerator of this statistic (apart from the factor $\tfrac12$)
differs only in the corrections for the means from the numerator of the first sample serial
correlation coefficient. The corresponding test statistics in the general case are the total and
partial correlation coefficients between $x_{t(h+1)}$ and the $x'_p$. Now, however, certain cross-products
are omitted as compared with the total and partial sample serial correlation
coefficients so that the power of the tests is reduced. It seems that the efficiency of the
estimates of the coefficients of the autoregressive equation (from which the asymptotic
power of the tests can be gauged) will be at a maximum when all of the coefficients are zero.
In this case the efficiency is $2(h+1)^{-1}$. In general, it will be lower. For example, the efficiency
of the estimate of a,, in the process x,+4,2%;_; +@_2_. = & in case a, = 0, is 3(1 +aj)-. 
It appears, therefore, that although exact tests for partial serial correlation are obtained 


by Ogawara’s approach these tests will be less powerful than those provided by the usual 
serial coefficients. 


4. THE TEST FOR SERIAL CORRELATION IN THE RESIDUALS
FROM A REGRESSION EQUATION


Consider the regression
$$y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \ldots + \beta_k x_{kt} + \epsilon_t,$$
where (1) $\epsilon_t = \rho\epsilon_{t-1} + \eta_t$ and $\eta_t$ is $N(0, \sigma(1-\rho^2)^{1/2})$; $|\rho| < 1$.
(2) The $\epsilon_t$ are independent of the $x_{jt}$.

Then the conditional distribution of the $y_{2t}$, for fixed $x_{j,t}$ and fixed $y_{2t+1}$ $(t = 0, \ldots, n)$ is
$$(2\pi\sigma_0^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma_0^2}\sum_{t=1}^{n}\Big(y_{2t} - \gamma_0 - \sum_{j=1}^{2k+1}\gamma_j z_{j,t}\Big)^2\right\}\prod_1^n dy_{2t}.$$
Here
$$\sigma_0^2 = \frac{\sigma^2(1-\rho^2)}{1+\rho^2}, \qquad \gamma_0 = \alpha(1 - \gamma_{k+1});$$
$$\gamma_j = \beta_j \quad\text{and}\quad z_{j,t} = x_{j,2t} \quad (j = 1, \ldots, k),$$
$$\gamma_{k+1} = \frac{2\rho}{1+\rho^2} \quad\text{and}\quad z_{k+1,t} = \tfrac12(y_{2t-1} + y_{2t+1});$$
$$\gamma_j = -\gamma_{k+1}\beta_{j-k-1} \quad\text{and}\quad z_{j,t} = \tfrac12(x_{j-k-1,2t-1} + x_{j-k-1,2t+1}) \quad (j = k+2, \ldots, 2k+1).$$
If the information involved in the knowledge that $\gamma_{j+k+1} = -\gamma_{k+1}\gamma_j$ $(j = 1, \ldots, k)$ is
discarded and the coefficients $\gamma_j$ are estimated by the straightforward least-squares procedure,
these estimates will have the usual properties of least-squares estimates in the classic
case. They will not be maximum-likelihood estimates. However, the computation of the
maximum-likelihood estimates will involve the solution of a system of second-order
equations and no exact tests will be available.

Representing the least-squares estimates of the $\gamma_j$ by $g_j$, the hypothesis $\rho = 0$ can be
tested by the use of the statistic
$$F_{1,\,n-2k-2} = (n-2k-2)\,\frac{g_{k+1}^2\,|L|}{\hat\sigma_0^2\,|L_{k+1,k+1}|},$$
which has the F distribution with 1 and $n-2k-2$ degrees of freedom. Here $\hat\sigma_0^2$ is the
variance of the residuals from the least-squares regression of $y_{2t}$ on the $z_j$, while $L$ is the
covariance matrix of the $z_j$: $|L_{k+1,k+1}|$ is the cofactor of the element in the indicated row
and column. The test is, of course, equivalent to a test of the partial correlation of $y_{2t}$ and
$\tfrac12(y_{2t-1} + y_{2t+1})$ with the effects of the $z_j$ $(j \neq k+1)$ removed.

Tests of significance and confidence limits for the parameters $\beta_j$ can be obtained from
statistics such as
$$F_{1,\,n-2k-2} = (n-2k-2)\,\frac{(g_j - \beta_j)^2\,|L|}{\hat\sigma_0^2\,|L_{j,j}|},$$
which have the F distribution with 1 and $(n-2k-2)$ degrees of freedom. Similarly, the
multiple correlation of $y$ and the $x_j$ may be tested by the statistic
$$F_{k,\,n-2k-2} = \frac{(n-2k-2)}{k\,\hat\sigma_0^2}\sum_{i,j=1}^{k}(A^{-1})_{ij}\,g_i g_j.$$
Here $A$ is the matrix formed from the first $k$ rows and columns of $L^{-1}$.

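These last tests are exact for any value of $\rho$, though the power of the tests will depend
upon that parameter.

In order to examine the limiting variances and covariances of the estimates of the $\gamma_j$ it
is necessary to obtain the stochastic limit of the matrix $L$.

If the vector $\{z_t\}$ is written $Z_t = \{Z_{1t}, z_{k+1,t}, Z_{2t}\}$ and the corresponding vector of sample
means $\bar{Z} = \{\bar{Z}_1, \bar{z}_{k+1}, \bar{Z}_2\}$, then the covariance matrix $L$ is
$$L = \frac{1}{n}\sum_t \begin{pmatrix}
(Z_{1t} - \bar{Z}_1)Z'_{1t} & (Z_{1t} - \bar{Z}_1)z_{k+1,t} & (Z_{1t} - \bar{Z}_1)Z'_{2t} \\
(z_{k+1,t} - \bar{z}_{k+1})Z'_{1t} & (z_{k+1,t} - \bar{z}_{k+1})z_{k+1,t} & (z_{k+1,t} - \bar{z}_{k+1})Z'_{2t} \\
(Z_{2t} - \bar{Z}_2)Z'_{1t} & (Z_{2t} - \bar{Z}_2)z_{k+1,t} & (Z_{2t} - \bar{Z}_2)Z'_{2t}
\end{pmatrix}.$$
Since $z_{k+1,t} = \alpha + \beta'Z_{2t} + \tfrac12(\epsilon_{2t-1} + \epsilon_{2t+1})$, where $\beta$ is the vector of regression coefficients,
under fairly weak restrictions on the nature of the processes generating the $x_j$, this matrix
will have a limit in probability of the form
$$\begin{pmatrix}
X_0 & X_1\beta & X_1 \\
\beta'X_1' & \tfrac12\sigma^2(1+\rho^2) + \beta'X_2\beta & \beta'X_2 \\
X_1' & X_2\beta & X_2
\end{pmatrix},$$
where $X_0$, $X_1$ and its transpose and $X_2$ are the stochastic limits of the matrices in the corners
of $L$.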
Since the determinantal value of $L$ will be almost everywhere different from 0 (if the
joint distribution of the $x_j$ is not singular) the matrix $L^{-1}$ will converge in probability to the
inverse of the matrix just written.

Since the covariance matrix of the $g_j$ (for $j > 0$) is $\sigma_0^2(nL)^{-1}$, it is readily seen that the
variance of $g_{k+1}$ converges in probability to
$$\frac{2}{n}\,\frac{1-\rho^2}{(1+\rho^2)^2}.$$
Hence the variance of the estimator of $\rho$ obtained from this statistic is
$$\frac{1}{2n}\,\frac{(1+\rho^2)^2}{(1-\rho^2)},$$
as in the case of the estimation of the parameter of a simple Markoff process by Ogawara's
method.

The test statistic, $d$, used by Durbin & Watson, is asymptotically equal to $2(1-r_1)$,
where $r_1$ is the first serial correlation coefficient computed from the actual residuals from
a least-squares regression. It can be seen from the formulae (5), (6), (7) and (8) given on
p. 164 of their paper (1951) that the variance of this statistic (when $\rho = 0$) will be, asymptotically,
$2n^{-1}$ when computed from $2n$ observations. It follows that the estimator of $\rho$ obtained
from $g_{k+1}$ is as efficient as that gained from the statistic $d$, when $\rho = 0$. The asymptotic
relative efficiency of $d$ and $g_{k+1}$ against the hypothesis $\rho = 0$ is therefore unity. It is easy to
show that the maximum-likelihood estimator of $\rho$ will have an asymptotic variance $(2n)^{-1}$
(from $2n$ observations) when $\rho$ equals zero. It therefore appears that $d$ and $g_{k+1}$ lead to tests
of the hypothesis, $\rho = 0$, which are asymptotically fully efficient.





The covariance matrix of the $g_j$ $(j = 1, \ldots, k)$ converges in probability to
$$\frac{\sigma^2}{n}\,\frac{1-\rho^2}{1+\rho^2}\,X,$$
where $X$ is the matrix in the top left-hand corner of the inverse of
$$\begin{pmatrix} X_0 & X_1 \\ X_1' & X_2 \end{pmatrix}.$$
If the $x_j$ are serially independent the covariance matrix of the $g_j$ $(j = 1, \ldots, k)$ becomes
in the limit
$$\frac{\sigma^2}{n}\,\frac{1-\rho^2}{1+\rho^2}\,X_0^{-1}.$$
But these $g_j$ are unbiased estimators of the $\beta_j$. The covariance matrix of the straightforward
least-squares estimates of the $\beta_j$ is, in this very special case (Wold, 1953, p. 213),
$$\frac{\sigma^2}{2n}\,X_0^{-1}.$$
The relative efficiency of the estimates of the $\beta_j$ by the two methods is then $\tfrac12(1+\rho^2)(1-\rho^2)^{-1}$.
The estimates obtained from the $g_j$ will, in this case, be more efficient than those from least
squares when $|\rho|\sqrt{3} > 1$.
In general, the relative efficiency of the two methods will depend upon the correlogram
of the $x_j$, as well as $\rho$. For example, in the case where there is only one regressor, which
follows a simple Markoff process with parameter $\rho_1$, the variance of $g_1$ will be, in the limit,
$$\operatorname{var}(g_1) = \frac{\sigma^2}{n}\,\frac{(1-\rho^2)(1+\rho_1^2)}{(1+\rho^2)(1-\rho_1^2)\,\sigma_1^2},$$
where $\sigma_1^2$ is the variance of $x_1$.
The variance of the straightforward least-squares estimate from $2n$ observations will then
tend to (Wold, 1953, p. 211)
$$\operatorname{var}(\hat\beta_1) = \frac{\sigma^2}{2n\sigma_1^2}\,\frac{1+\rho\rho_1}{1-\rho\rho_1}.$$





The relative efficiency $\operatorname{var}(\hat\beta_1)/\operatorname{var}(g_1)$
is shown in Table 3. This table will also give the relative efficiencies of the $g_j$ $(j = 1, \ldots, k)$
for a $k$ variate regression when each of the $x_j$ follows a simple Markoff process with the same
parameter $\rho_1$.

Table 3. $\operatorname{var}(\hat\beta_1)/\operatorname{var}(g_1)$

$\rho_1$ \ $\rho$   −0.8   −0.6   −0.4   −0.2    0     0.2    0.4    0.6    0.8
0                   2.28   1.06   0.69   0.54   0.50   0.54   0.69   1.06   2.28
0.2                 1.52   0.77   0.54   0.46   0.46   0.54   0.75   1.25   2.90
0.4                 0.85   0.47   0.36   0.33   0.36   0.42   0.69   1.26   3.20
0.6                 0.38   0.24   0.20   0.20   0.24   0.32   0.63   1.06   3.05
0.8                 0.11   0.08   0.08   0.09   0.11   0.16   0.29   0.66   2.28
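
Entries of Table 3 can be recomputed from the two limiting variances for a single regressor. The Python sketch below assumes the forms $\operatorname{var}(g_1) \propto (1-\rho^2)(1+\rho_1^2)/\{(1+\rho^2)(1-\rho_1^2)\}$ and $\operatorname{var}(\hat\beta_1) \propto \tfrac12(1+\rho\rho_1)/(1-\rho\rho_1)$, both in units of $\sigma^2/(n\sigma_1^2)$; the function name is hypothetical and the check is limited to a few entries.

```python
def variance_ratio(rho, rho1):
    """Ratio var(least squares) / var(g_1), under the assumed limiting forms."""
    least_squares = 0.5 * (1.0 + rho * rho1) / (1.0 - rho * rho1)
    conditional = ((1.0 - rho ** 2) * (1.0 + rho1 ** 2)
                   / ((1.0 + rho ** 2) * (1.0 - rho1 ** 2)))
    return least_squares / conditional


for rho1 in (0.0, 0.2, 0.8):
    row = [round(variance_ratio(rho, rho1), 2) for rho in (-0.8, 0.0, 0.8)]
    print(rho1, row)
```

A ratio above 1 marks the combinations of $\rho$ and $\rho_1$ for which the estimate $g_1$ has the smaller limiting variance.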


The limiting variance of 9, is ao? 1—p? 


The corresponding estimate of x is }o(1—,,,)~1, and this has the asymptotic variance 
o 1l+p 1 
n =pR 


The variance of the straightforward least-squares estimate of « will tend to 


so that the estimate obtained from $g_0$ is always relatively inefficient and is very inefficient
for values of $\rho$ near 1. In such cases a better estimate than that obtained from $g_0$ would be
$$\hat\alpha = \bar{y} - \sum_1^k g_j\bar{x}_j,$$
where the means are obtained from all of the observations.


A common procedure, when positive serial correlation is suspected in the residuals, is to
estimate the regression coefficients from the first differences of the $y_t$ and $x_{jt}$ (see Cochrane
& Orcutt, 1949). Again the efficiency of this method will depend upon the correlogram of
the $x_j$. When there is only one regressor, which follows a simple Markoff process with
variance $\sigma_1^2$ and serial correlation $\rho_1$, the asymptotic variance of the least-squares estimate
of $\beta_1$, from the first differences, can be shown to be (when estimated from $2n$ observations)
$$\frac{\sigma^2}{\sigma_1^2}\,\frac{1}{4n}\,\frac{(1-\rho)}{(1-\rho_1)}\left(\frac{3 - \rho\rho_1 - \rho - \rho_1}{1 - \rho\rho_1}\right).$$
If $\rho > 0$ this will be smaller than the variance of the estimator $g_1$.
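
The comparison in the last sentence can be checked numerically over a grid of parameter values. The sketch below is only a check under assumed forms, not a proof: both variances are expressed in units of $\sigma^2/(n\sigma_1^2)$, the expression for $g_1$ is the single-regressor limiting form $(1-\rho^2)(1+\rho_1^2)/\{(1+\rho^2)(1-\rho_1^2)\}$, and the function names are hypothetical.

```python
def var_first_difference(rho, rho1):
    """Limiting variance of the first-difference estimate, in units of sigma^2 / (n sigma_1^2)."""
    return (0.25 * (1.0 - rho) / (1.0 - rho1)
            * (3.0 - rho * rho1 - rho - rho1) / (1.0 - rho * rho1))


def var_g1(rho, rho1):
    """Assumed limiting variance of g_1 in the same units."""
    return ((1.0 - rho ** 2) * (1.0 + rho1 ** 2)
            / ((1.0 + rho ** 2) * (1.0 - rho1 ** 2)))


rhos = [0.05 * k for k in range(1, 20)]        # 0.05, ..., 0.95
rho1s = [0.05 * k for k in range(-19, 20)]     # -0.95, ..., 0.95
always_smaller = all(var_first_difference(r, r1) < var_g1(r, r1)
                     for r in rhos for r1 in rho1s)
print(always_smaller)
```

The grid covers positive $\rho$ only, matching the condition stated in the text.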

It must be emphasized that the validity of the tests of the regression coefficients, together
with the estimates of these coefficients (including the estimate of $\rho$ derived from $g_{k+1}$),
depends on the specification of the errors in the regression equation as a simple Markoff
process being correct. The test of significance of $\rho$ is, of course, always an exact test.

Example. Durbin & Watson (1951) use economic data due to A. R. Prest (1949) to demonstrate
their method. The original data, given in their paper, cover the period 1870-1938,
and show the logarithm of the consumption of spirits per head in the United Kingdom ($y_t$),
the logarithm of real income per head ($x_{1t}$) and the logarithm of the relative price of spirits
($x_{2t}$). The estimates of the coefficients from the least-squares regression of $y$ on $x_1$ and $x_2$ are
$$\hat\beta_1 = -0.120, \qquad \hat\beta_2 = -1.228.$$





Their statistic $d = 0.2488$ and is far below the lower bound to the significance point for
$d$ at the 1% level, indicating a highly significant positive serial correlation.

We will apply the method presented in this paper to this example.

The coefficients $g_j$ are given by
$$\begin{pmatrix} g_1 \\ g_2 \\ g_3 \\ g_4 \\ g_5 \end{pmatrix} =
\begin{pmatrix}
0.307 & 0.490 & -0.651 & 0.295 & 0.495 \\
0.490 & 1.445 & -1.818 & 0.474 & 1.423 \\
-0.651 & -1.818 & 2.423 & -0.631 & -1.815 \\
0.295 & 0.474 & -0.631 & 0.288 & 0.478 \\
0.495 & 1.423 & -1.815 & 0.478 & 1.417
\end{pmatrix}^{-1}
\begin{pmatrix} -0.636 \\ -1.818 \\ 2.401 \\ -0.621 \\ -1.804 \end{pmatrix} =
\begin{pmatrix} 0.795 \\ -0.693 \\ 0.950 \\ -0.792 \\ 0.630 \end{pmatrix}.$$

















$$g_0 = 0.218, \qquad \hat\sigma_0^2 = 0.000232.$$
The elements of the 5 × 5 matrix are $\sum_1^{34}(z_{it} - \bar{z}_i)(z_{jt} - \bar{z}_j)$. For example,
$$0.495 = \sum\{\tfrac12(x_{2,2t-1} + x_{2,2t+1}) - \bar{z}_5\}(x_{1,2t} - \bar{z}_1),$$
where $\bar{z}_5 = \tfrac{1}{34}\sum\tfrac12(x_{2,2t-1} + x_{2,2t+1})$ and $\bar{z}_1 = \tfrac{1}{34}\sum x_{1,2t}$.

All of the parameters $g_j$ are significant at the 1% point. The value of F corresponding to
$g_3$ equals 308 in fact. However, the 5% confidence intervals for $\gamma_1$ and $\gamma_2$ do not include the
straightforward least-squares estimates. They are:
$$0.339 < \gamma_1 < 1.251, \qquad -0.991 < \gamma_2 < -0.405.$$


On prior grounds the value of −0.120 for $\hat\beta_1$ is unacceptable, for it is very unlikely that
the demand for spirits would fall as income rises. In the original work of A. R. Prest (1949)
the coefficients were estimated from the first differences of the observations and the
resulting regression coefficients were
$$\hat\beta_1 = 0.736, \qquad \hat\beta_2 = -0.861,$$
both of which lie within the 5% confidence intervals for $\gamma_1$ and $\gamma_2$.

The variates $x_1$ and $x_2$ have high positive serial correlation coefficients of about the same
value. Together with the high positive serial correlation of the residuals this suggests that
the extension of Ogawara's method here presented will give estimators of $\beta_1$ and $\beta_2$ which
will be at least as efficient as those from the straightforward least-squares procedure. This
is to some extent borne out by the results.


I wish to thank Prof. P. A. P. Moran for suggesting this subject and its developments to 
me and for his advice in the research done and in the preparation of this paper. I should 
also like to thank the referee for a number of suggestions and, in particular, for pointing 
out the importance of the first difference transformation. 
ON THE EFFICIENCY OF PROCEDURES FOR SMOOTHING
PERIODOGRAMS FROM TIME SERIES WITH
CONTINUOUS SPECTRA


By M. S. BARTLETT and J. MEDHI
University of Manchester 


1. INTRODUCTION 


Let $X_r$ $(r = 1, 2, \ldots, n)$ be a set of consecutive observations from a real stochastic process
$X_t$ in discrete time. The process is assumed stationary, at least up to the second order, so
that the stochastic average $E\{X_t\}$ is a constant ($m$), which for convenience is put zero, and
the autocovariance function
$$E\{X_t X_{t+s}\} = w_s = \sigma^2\rho_s \tag{1.1}$$
is a function only of the interval $s$. Then an autocorrelation function $\rho_s$ $(s = 0, \pm 1, \pm 2, \ldots)$
is a valid one for some stationary process $X_t$ if and only if
$$\rho_s = \int_0^\pi \cos(s\lambda)\,dF(\lambda), \tag{1.2}$$
where $F(\lambda)$, the (integrated) spectrum of the process, has the form of a distribution function
defined in $(0, \pi)$ (Wold, 1938). Ignoring the possibility of a third singular component, we
have further the relation
$$F(\lambda) = c_1F_1(\lambda) + c_2F_2(\lambda), \tag{1.3}$$
where $F_1(\lambda)$ is a step function, the (integrated) discrete spectrum, and $F_2(\lambda)$ has a non-zero
derivative $f_2(\lambda)$, the continuous spectrum. In particular when $c_1$ is zero (as will now be
assumed)
$$\rho_s = \int_0^\pi \cos(s\lambda)\,f(\lambda)\,d\lambda, \tag{1.4}$$
which has the inversion formula
$$f(\lambda) = \frac{1}{\pi}\sum_{s=-\infty}^{\infty}\rho_s\cos(s\lambda). \tag{1.5}$$
We define $g(\lambda) = 2\pi\sigma^2 f(\lambda)$, so that
$$g(\lambda) = 2\sum_{s=-\infty}^{\infty} w_s\cos(s\lambda). \tag{1.6}$$


The problem of estimating directly the spectral density function f(A) (or g(A)), previously 
considered by Daniell (see the discussion to the paper by Bartlett, 1946), Bartlett (1950) 
and Grenander (1951), will be discussed further in this paper. Grenander & Rosenblatt 
(1952, 1954) have recently investigated also the problem of constructing an entire con- 
fidence band for the integrated density function g(A); for further suggestions on this 
problem, which will not be considered here, see Bartlett (1954). 


2. ESTIMATES OF THE SPECTRAL DENSITY FUNCTION 
The sample autocovariance, which will be defined as
$$C_s = \begin{cases} \dfrac{1}{n}\displaystyle\sum_{r=1}^{n-|s|} X_r X_{r+|s|} & (|s| < n), \\ 0 & (|s| \geq n), \end{cases} \tag{2.1}$$
is, under wide conditions, a consistent estimate of $w_s$. If now we substitute $C_s$ for $w_s$ in
(1.6), we obtain
$$2\sum_{s=-n}^{n} C_s\cos(s\lambda) = \frac{2}{n}\sum_{u,v=1}^{n} X_u X_v\cos[(u-v)\lambda] = I(\lambda), \tag{2.2}$$
where $I(\lambda)$ is the periodogram intensity $G(\lambda)G^*(\lambda) = A^2(\lambda) + B^2(\lambda)$, $G(\lambda)$ being calculated as
$$\sqrt{\frac{2}{n}}\,\sum_{r=1}^{n} X_r e^{ir\lambda} \tag{2.3}$$
(usually for integral $p$, where $\lambda = 2\pi p/n$). However, in spite of the relation (2.2), it is now
known that $I(\lambda)$ does not provide a consistent estimate of $g(\lambda)$, owing to its large sampling
fluctuations which do not diminish as $n$ increases. The relevant stochastic properties of $I(\lambda)$
have been discussed by Slutsky (1927), Bartlett (1950, 1954), Grenander (1951) and others,
and will merely be quoted when necessary. The further condition imposed on $X_t$ for these
properties to hold is that, if not normal, at least it has the linear form
$$X_t = \sum_{u=0}^{\infty} b_u Y_{t-u}, \tag{2.4}$$
where $Y_t$ is a sequence of independent quantities with zero mean and constant variance $\sigma^2$.
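
The identity between the autocovariance form and the direct form of the periodogram can be verified numerically. The following Python sketch is illustrative only: it assumes the sample autocovariance has divisor $n$ and that $G(\lambda)$ carries the factor $\sqrt{2/n}$, and the function names and the short data series are hypothetical.

```python
import cmath
import math


def autocovariance(x, s):
    """Sample autocovariance with divisor n (zero mean assumed), zero for |s| >= n."""
    n = len(x)
    s = abs(s)
    if s >= n:
        return 0.0
    return sum(x[r] * x[r + s] for r in range(n - s)) / n


def periodogram_from_autocovariances(x, lam):
    """Twice the cosine sum of the sample autocovariances over all lags."""
    n = len(x)
    return 2.0 * sum(autocovariance(x, s) * math.cos(s * lam)
                     for s in range(-(n - 1), n))


def periodogram_direct(x, lam):
    """Squared modulus of G(lambda) = sqrt(2/n) * sum of X_r exp(i r lambda)."""
    n = len(x)
    g = math.sqrt(2.0 / n) * sum(x[r] * cmath.exp(1j * lam * (r + 1)) for r in range(n))
    return abs(g) ** 2


data = [0.3, -1.2, 0.8, 1.5, -0.4, 0.1, -0.9, 2.0]
lam = 2.0 * math.pi * 3 / len(data)
via_autocov = periodogram_from_autocovariances(data, lam)
direct = periodogram_direct(data, lam)
print(via_autocov, direct)
```

The two values agree to rounding error for any frequency, not only for the harmonic frequencies $2\pi p/n$.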
In view of the sampling properties of $I(\lambda)$, 'smoothed' estimates of $g(\lambda)$ have been
proposed. Thus Daniell put forward the estimate
$$g_D(\lambda) = \frac{1}{2h}\int_{\lambda-h}^{\lambda+h} I(\omega)\,d\omega, \tag{2.5}$$
and Bartlett's formula (cf. also Tukey and Hamming, 1949) for smoothing the periodogram
$I(\lambda)$ is
$$g_B(\lambda) = 2\sum_{s=-n'+1}^{n'-1}\left(1 - \frac{|s|}{n'}\right)C'_s\cos(s\lambda), \tag{2.6}$$
where $C'_s = C_s/(1 - |s|/n)$. The resolving power of this last procedure increases with $n'$,
and the smoothing with $m = n/n'$ (where $n$ is the total length of the series), fluctuations
being of order $1/\sqrt{m}$. Grenander subsequently examined a whole class of estimates containing
the above two, by introducing a general weighting factor, $u_s(\lambda)$, in the formula
$$g_G(\lambda) = 2\sum_{s=-n}^{n} u_s(\lambda)\,C_s\cos(s\lambda), \tag{2.7}$$
or in alternative form (for suitable conditions on $u_s(\lambda)$)
$$g_G(\lambda) = \int_0^\pi I(\omega)\,w_\lambda(\omega)\,d\omega, \tag{2.8}$$
where the positive weighting function $w_\lambda(\omega)$ is expressible in terms of $u_s(\lambda)$ by the formula
$$w_\lambda(\omega) = \frac{1}{\pi}\sum_{s=-n}^{n} u_s(\lambda)\cos(s\lambda)\cos(s\omega). \tag{2.9}$$
The estimate $g_G(\lambda)$ has the asymptotic sampling properties
$$E\{g_G(\lambda)\} \sim \int_0^\pi g(\omega)\,w_\lambda(\omega)\,d\omega, \tag{2.10}$$
$$\operatorname{var}\{g_G(\lambda)\} \sim \frac{2\pi}{n}\int_0^\pi g^2(\omega)\,w_\lambda^2(\omega)\,d\omega. \tag{2.11}$$














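The relation of Daniell's estimate $g_D(\lambda)$ in (2.5) to the general formula (2.8) is obvious; the
other estimate in (2.6) has for large $n$ the effective weighting factor
$$u_s(\lambda) = \begin{cases} 1 - \dfrac{|s|}{n'} & (|s| \leq n'), \\ 0 & (|s| > n'), \end{cases} \tag{2.12}$$
and the weighting function corresponding to $u_s(\lambda)$ in (2.12) is
$$w_\lambda(\omega) = \frac{1}{2\pi n'}\left[\frac{\sin^2\{\tfrac12 n'(\omega-\lambda)\}}{\sin^2\{\tfrac12(\omega-\lambda)\}} + \frac{\sin^2\{\tfrac12 n'(\omega+\lambda)\}}{\sin^2\{\tfrac12(\omega+\lambda)\}}\right]. \tag{2.13}$$
Clearly
$$E\{g_D(\lambda)\} \sim \frac{1}{2h}\int_{\lambda-h}^{\lambda+h} g(\omega)\,d\omega \sim g(\lambda) \quad\text{for small } h \text{ only}, \tag{2.14}$$
and
$$\operatorname{var}\{g_D(\lambda)\} \sim \frac{\pi}{nh}\,g^2(\lambda) \quad\text{for small } h. \tag{2.15}$$
Grenander has given also the results
$$E\{g_B(\lambda)\} \sim g(\lambda), \tag{2.16}$$
$$\operatorname{var}\{g_B(\lambda)\} \sim \begin{cases} \dfrac{2n'}{3n}\,g^2(\lambda) & (\lambda \neq 0), \\[1ex] \dfrac{4n'}{3n}\,g^2(\lambda) & (\lambda = 0). \end{cases} \tag{2.17}$$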
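
The closed form of the weighting function for the triangular weighting factor can be compared with its defining cosine series. The Python sketch below assumes the series $w_\lambda(\omega) = \pi^{-1}\sum u_s\cos(s\lambda)\cos(s\omega)$ with $u_s = 1 - |s|/n'$ and the closed form as a sum of two squared-sine ratios divided by $2\pi n'$; the function names and the chosen values of $\lambda$, $\omega$ and $n'$ are for illustration only.

```python
import math


def weight_series(lam, omega, n_prime):
    """Cosine-series form of the weighting function for u_s = 1 - |s|/n'."""
    total = 0.0
    for s in range(-(n_prime - 1), n_prime):
        u = 1.0 - abs(s) / n_prime
        total += u * math.cos(s * lam) * math.cos(s * omega)
    return total / math.pi


def weight_closed_form(lam, omega, n_prime):
    """Closed form; undefined where omega equals plus or minus lam (removable singularity)."""
    def kernel(x):
        return math.sin(0.5 * n_prime * x) ** 2 / math.sin(0.5 * x) ** 2
    return (kernel(omega - lam) + kernel(omega + lam)) / (2.0 * math.pi * n_prime)


lam, omega, n_prime = 0.4, 1.0, 8
series_value = weight_series(lam, omega, n_prime)
closed_value = weight_closed_form(lam, omega, n_prime)

# Midpoint-rule integral of the weighting function over (0, pi).
points = 400
width = math.pi / points
integral = sum(weight_series(lam, (j + 0.5) * width, n_prime) for j in range(points)) * width
print(series_value, closed_value, integral)
```

The integral over $(0, \pi)$ comes out as 1, since only the $s = 0$ term of the series survives integration.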


3. UNCERTAINTY RELATIONS 


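When Grenander investigated further the sampling properties of $g_G(\lambda)$, he imposed the
condition

$$\int_0^\pi w_\lambda(\omega)\,d\omega = 1 \qquad (3.1)$$

as a 'sort of asymptotic unbiasedness', but this condition (which is satisfied by the
weighting function for $g_D(\lambda)$, even for $h$ not small) appears unsatisfactory, and we shall use
the correct condition for asymptotic unbiasedness from formula (2.10), viz.

$$\int_0^\pi g(\omega)\,w_\lambda(\omega)\,d\omega \sim g(\lambda). \qquad (3.2)$$

With this condition the formula for $\operatorname{var}\{g_G(\lambda)\}$ in (2.11) yields

$$\frac{n}{2\pi}\operatorname{var}\{g_G(\lambda)\} \sim \int_0^\pi g^2(\omega)\,w_\lambda^2(\omega)\,d\omega \ge \frac{1}{\pi}\left[\int_0^\pi g(\omega)\,w_\lambda(\omega)\,d\omega\right]^2 \sim \frac{1}{\pi}\,g^2(\lambda),$$

whence asymptotically

$$\operatorname{var}\{g_G(\lambda)\} \ge \frac{2}{n}\,g^2(\lambda). \qquad (3.3)$$

Moreover, the condition for equality is reached if

$$w_\lambda(\omega) = g(\lambda)/[\pi g(\omega)]. \qquad (3.4)$$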
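The inequality behind (3.3) is the Schwarz inequality applied on $(0, \pi)$, and it can be checked numerically. The sketch below is only an illustration: the spectrum `g`, the frequency `lam` and the half-width 0.3 are hypothetical choices, not values from the paper. It rescales each weighting function so that the unbiasedness condition (3.2) holds, then compares $\int g^2 w_\lambda^2\,d\omega$ with the bound $g^2(\lambda)/\pi$.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule written out explicitly."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

omega = np.linspace(0.0, np.pi, 20001)
g = 1.0 / (1.25 - np.cos(omega))        # hypothetical spectrum, for illustration only
lam = 1.0
g_lam = 1.0 / (1.25 - np.cos(lam))

def scaled_variance(w):
    """Rescale w so that (3.2) holds, then return the integral of g^2 w^2."""
    w = w * g_lam / integrate(g * w, omega)
    return integrate(g**2 * w**2, omega)

bound = g_lam**2 / np.pi                                  # lower bound behind (3.3)
ideal = scaled_variance(1.0 / g)                          # weighting of form (3.4)
daniell = scaled_variance((np.abs(omega - lam) <= 0.3).astype(float))

print("bound  :", bound)
print("ideal  :", ideal)      # attains the bound
print("Daniell:", daniell)    # exceeds the bound
```

The weighting proportional to $1/g(\omega)$ attains the bound, while a rectangular (Daniell-type) weighting lies above it, as (3.3) and (3.4) require.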


This 'ideal' estimate thus has weighting function inversely proportional to $g(\omega)$,† not $g^2(\omega)$,
as Grenander obtained with condition (3.1). However, it is still entirely theoretical in
character, for it assumes a complete knowledge of the unknown spectral function $g(\omega)$.
Furthermore, for different values of $\lambda$, the weighting function multiplying the stochastic
quantity $I(\omega)$ remains inversely proportional to $g(\omega)$, and all the corresponding estimates
$g_G(\lambda)$ are stochastically equivalent.

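In an attempt to overcome such difficulties, Grenander made use of a measure of
'resolvability', defined in general by the 'standard deviation' of $\omega$ for the weighting
function $w_\lambda(\omega)$. As $w_\lambda(\omega)$ no longer necessarily satisfies the 'normalization' condition (3.1),
this measure is modified here to

$$U_1 = \left[\int_0^\pi (\omega-\lambda)^2\,w_\lambda(\omega)\,d\omega \Big/ \int_0^\pi w_\lambda(\omega)\,d\omega\right]^{1/2} \qquad (3.5)$$

(note also that the deviation $\omega - \lambda$ is used, whether or not the 'mean' $\omega$ is exactly $\lambda$). It is
further generalized to

$$U_1^{(r)} = \left[\int_0^\pi |\omega-\lambda|^r\,w_\lambda(\omega)\,d\omega \Big/ \int_0^\pi w_\lambda(\omega)\,d\omega\right]^{1/r} \quad (r > 0), \qquad (3.6)$$

mainly in order to stress the rather arbitrary character of such measures. In terms of
$U_1^{(r)}$ and

$$U_2 = \int_0^\pi g^2(\omega)\,w_\lambda^2(\omega)\,d\omega \sim \frac{n}{2\pi}\operatorname{var}\{g_G(\lambda)\}, \qquad (3.7)$$

we may now generalize Grenander's 'uncertainty principle', connecting $U_1^{(r)}$ and $U_2$ by the
following modification of his argument.
From the Tchebychev inequality

$$Q\{|\omega-\lambda| < 2U_1^{(r)}\} \ge 1 - 2^{-r},$$

where $Q$ denotes 'probability' in terms of the formal frequency function
$q(\omega) = w_\lambda(\omega)/k_0$, we have

$$\int_0^\pi q^2(\omega)\,d\omega \ge \int_{\lambda-2U_1^{(r)}}^{\lambda+2U_1^{(r)}} q^2(\omega)\,d\omega \ge \frac{1}{4U_1^{(r)}}\left[\int_{\lambda-2U_1^{(r)}}^{\lambda+2U_1^{(r)}} q(\omega)\,d\omega\right]^2 \ge \frac{1}{4U_1^{(r)}}\left[1 - 2^{-r}\right]^2.$$

Let $A$, $B$ be the upper and lower bounds of $g(\omega)$ in $(0, \pi)$, these being assumed (respectively)
finite and non-zero. Then

$$U_2 \ge B^2\int_0^\pi w_\lambda^2(\omega)\,d\omega \ge \frac{B^2 k_0^2}{4U_1^{(r)}}\left[1 - 2^{-r}\right]^2,$$

where

$$k_r = \int_0^\pi \omega^r\,w_\lambda(\omega)\,d\omega. \qquad (3.8)$$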
0 


† The ideal weighting of $I(\omega)$ by a function varying as $1/g(\omega)$ is in entire accord with the asymptotic
transformation to a uniform spectrum often desirable in spectral analysis (see, for example, Bartlett,
1954).
Now

$$A k_0 \ge \int_0^\pi w_\lambda(\omega)\,g(\omega)\,d\omega \sim g(\lambda),$$

whence finally

$$U = U_1^{(r)}U_2 \ge \tfrac14\,g^2(\lambda)\,\frac{B^2}{A^2}\left[1 - 2^{-r}\right]^2 > 0, \qquad (3.9)$$

which shows that the product $U$ of the two quantities $U_1^{(r)}$ and $U_2$ has a positive minimum.
The condition of asymptotic unbiasedness which we have imposed could perhaps be
waived by considering the mean-square deviation of $g_G(\lambda)$ from $g(\lambda)$; this would, however,
no longer give a criterion like $U_2$, asymptotically independent of $n$, so that it seems more
convenient to adopt the conditions above.


4. ASYMPTOTIC EFFICIENCIES OF $g_D(\lambda)$ AND $g_B(\lambda)$
From (3.9) we may define a quantity

$$W = U/g^2(\lambda) \qquad (4.1)$$

as a measure of uncertainty depending on the form of the estimate $g_G(\lambda)$, and on the choice
of $r$ in $U_1^{(r)}$. We compare $g_D(\lambda)$ and $g_B(\lambda)$ by this criterion. The 'ideal' estimate (for $r = 2$)
with the weighting function given in (3.4) is modified to

$$w_\lambda(\omega) = g(\lambda)/[2hg(\omega)] \qquad (4.2)$$

if restricted to the interval $(\lambda-h, \lambda+h)$. For small $h$, $g(\omega) \sim g(\lambda)$, so that $g_D(\lambda)$ approximates
to the 'ideal' estimate if the latter is so restricted. As, for $g_D(\lambda)$ with small $h$,

$$U_1^{(r)} \sim h/(r+1)^{1/r}, \qquad U_2 \sim g^2(\lambda)/(2h), \qquad (4.3)$$

we have

$$W \sim \tfrac12 (r+1)^{-1/r}. \qquad (4.4)$$
Table 1 
A=}4n A=\n 
h: ayn 0-290 h: ym 0-293 
ay 0-292 ln 0-306 
wn 0-301 An 0-314 
#0 0-311 2 0-321 
lm 0-322 kn =—0-322 














A numerical calculation (Table 1) was made of $W$ for the restricted 'ideal' estimate
($r = 2$) for varying $h$ in the particular case of the spectrum of M. G. Kendall's artificial
series I (Kendall, 1946). As, for small $h$, $W \sim 0.289$ (for all $\lambda$ within the total range), the above
results show that $W$ is smallest, in this particular case at least, as $h \to 0$, for the above class
of estimate. More generally, for small $h$, it is readily shown that


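$$W \sim \frac{1}{2\sqrt{3}}\left(1 + \frac{2ch^2}{5}\right), \qquad (4.5)$$

where

$$c = \frac{1}{3g^2(\lambda)}\left\{\frac{dg(\lambda)}{d\lambda}\right\}^2 - \frac{1}{6g(\lambda)}\frac{d^2g(\lambda)}{d\lambda^2}, \qquad (4.6)$$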


so that whether W increases or decreases as h increases from zero depends on the sign of c; 
it will, however, increase in the neighbourhood of a maximum of g(A). 
For $g_B(\lambda)$ it can be shown (e.g. by the use of (2.13)) that

$$\left.\begin{aligned}
{[U_1^{(2)}]}^2 &\sim \frac{4}{\pi n'}\left[(\pi-\lambda)\log\cos\tfrac12\lambda + \lambda\log\sin\tfrac12\lambda + \pi\log 2\right], \\
U_1^{(1)} &\sim \frac{2}{\pi n'}\log n', \\
{[U_1^{(r)}]}^r &\sim 1/[(n')^r\cos(\tfrac12\pi r)\,\Gamma(2-r)] \quad (0 < r < 1).
\end{aligned}\right\} \qquad (4.7)$$

Making use of the result (2.17) (for $\lambda \ne 0$), we arrive at the comparison given in Table 2 of
$g_D(\lambda)$ and $g_B(\lambda)$ in terms of $W$. Asymptotically, $g_D(\lambda)$ is thus superior to $g_B(\lambda)$ if the criterion
$W$ is taken for $r = 2$ (or 1), but the generalization of the resolvability measure $U_1$ emphasizes
its arbitrary character, and the 'efficiencies' of the two estimates cross over as $r$ decreases.








Table 2

  r        g_D(λ)     g_B(λ)
  2        0.289      0.125 √n'   (λ = ½π)
  1        0.250      0.0675 log n'
  1/2      0.222      0.270
  1/3      0.211      0.222
  0.1      0.193      0.177
  0+       0.184      0.162
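The entries of Table 2 for $0 < r < 1$ can be recomputed from the asymptotic formulae quoted above. The sketch below is an illustration under stated assumptions: it takes $W \sim \tfrac12(r+1)^{-1/r}$ for the Daniell estimate from (4.4), and for the other estimate it combines the last line of (4.7) with $U_2/g^2(\lambda) \sim n'/(3\pi)$, which follows from (2.17) and (3.7) for $\lambda \ne 0$. The function names are hypothetical.

```python
import math

def w_daniell(r):
    """W for the Daniell estimate, from (4.4)."""
    return 0.5 * (r + 1.0) ** (-1.0 / r)

def w_bartlett(r):
    """W for the estimate with weights (2.12), valid for 0 < r < 1."""
    # n' * U_1^(r), from the last line of (4.7)
    u1_times_n = (math.cos(0.5 * math.pi * r) * math.gamma(2.0 - r)) ** (-1.0 / r)
    # U_2 / g^2 ~ n' / (3 * pi) for lambda != 0, so n' cancels in the product
    return u1_times_n / (3.0 * math.pi)

for r in (0.5, 1.0 / 3.0, 0.1):
    print(f"r = {r:.3f}: g_D {w_daniell(r):.3f}, g_B {w_bartlett(r):.3f}")
```

The crossing of the two columns between $r = 1/3$ and $r = 0.1$ is visible directly from these values.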

















5. INVESTIGATION OF THE THEORETICAL OPTIMUM WEIGHTING FUNCTION 


The replacement of $U_2$ by $U_1^{(r)}U_2$ as a criterion of efficiency naturally complicates the
investigation of the 'ideal' weighting function, but while this will still of course depend on
a knowledge of the theoretical spectrum, it seemed worth while investigating the form of
the optimum function if this criterion were adopted. We examine the case $r = 2$. Let
$w_\lambda(\omega) = v^2(\omega)$, where $v(\omega)$ is real, be the optimum function, and consider the slightly modified
function $v(\omega) + \eta V(\omega)$, where $\eta$ is small and $V(\omega)$ arbitrary. Denote the change in $U_1U_2$ by
$\delta(U_1U_2)$. Now from (3.8)

$$U_1 = [(k_2 - 2\lambda k_1 + \lambda^2 k_0)/k_0]^{1/2}, \qquad (5.1)$$

and, to the first order of $\eta$, $\delta(k_r) = 2\eta K_r$, where

$$K_r = \int_0^\pi \omega^r\,v(\omega)\,V(\omega)\,d\omega. \qquad (5.2)$$

Hence

$$\delta(U_1) = \eta\{(K_2 - 2\lambda K_1 + \lambda^2 K_0)/(U_1 k_0) - K_0U_1/k_0\}.$$

Also

$$\delta(U_2) = 4\eta\int_0^\pi g^2(\omega)\,v^3(\omega)\,V(\omega)\,d\omega = 4\eta L, \text{ say}.$$

Thus finally

$$\delta(U_1U_2) = \frac{\eta}{k_0}\left[\frac{(K_2 - 2\lambda K_1 + \lambda^2 K_0)\,U_2}{U_1} - K_0U_1U_2 + 4k_0U_1L\right]. \qquad (5.3)$$

As $v(\omega)$ is assumed to be the optimum function, the coefficient of $\eta$ should vanish. This
implies that the expression within the square brackets in (5.3) should vanish for any
appropriate $V(\omega)$. The condition of asymptotic unbiasedness gives the condition

$$\int_0^\pi g(\omega)\,v(\omega)\,V(\omega)\,d\omega = 0, \qquad (5.4)$$
0 
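which will be satisfied if we take in particular

$$V(\omega)\,d\omega = d\left[\frac{\epsilon(\omega-\alpha)}{g(\alpha)v(\alpha)} - \frac{\epsilon(\omega-\beta)}{g(\beta)v(\beta)}\right],$$

where

$$\epsilon(x) = \begin{cases} 0 & (x < 0), \\ 1 & (x \ge 0). \end{cases}$$

The expression in square brackets in (5.3) now becomes

$$\frac{U_2}{U_1}\left[\frac{(\alpha-\lambda)^2}{g(\alpha)} - \frac{(\beta-\lambda)^2}{g(\beta)}\right] - U_1U_2\left[\frac{1}{g(\alpha)} - \frac{1}{g(\beta)}\right] + 4k_0U_1[g(\alpha)v^2(\alpha) - g(\beta)v^2(\beta)],$$

and, as this should be zero for arbitrary $\alpha$, $\beta$, we obtain

$$\frac{U_2}{U_1}\frac{(\alpha-\lambda)^2}{g(\alpha)} - \frac{U_1U_2}{g(\alpha)} + 4k_0U_1\,g(\alpha)\,v^2(\alpha) = \text{constant},$$

or

$$v^2(\alpha) = \frac{A}{g(\alpha)} + \frac{B}{g^2(\alpha)}\left[1 - \frac{(\alpha-\lambda)^2}{U_1^2}\right], \qquad (5.5)$$

which determines the optimum form (if this exists in the above sense). It depends of course,
as anticipated, on the unknown theoretical spectral function $g(\omega)$. In (5.5),

$$B = \tfrac14 U_2/k_0, \qquad (5.6)$$

and the two formulae for $U_1$ and $U_2$ give, with the unbiasedness condition, three further
relations for $A$, $B$, $U_1$ and $U_2$. If these yield an appropriate solution, the minimum $U_1U_2$
is then also calculable.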


6. EQUATIONS FOR THE OPTIMUM FUNCTION IN THE CASE OF THE 
SECOND-ORDER AUTOREGRESSIVE PROCESS 


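The determination of the coefficients in (5.5) is unfortunately extremely laborious, and
sometimes impossible, as was experienced in the particular case of the second-order
autoregressive process

$$X_{t+2} + aX_{t+1} + bX_t = Y_{t+2}. \qquad (6.1)$$

For (6.1) we have

$$g(\lambda) = 2\pi\sigma^2 f(\lambda), \qquad f(\lambda) = \delta/(\alpha + \beta\cos\lambda + \gamma\cos 2\lambda), \qquad (6.2)$$

where $\alpha = 1 + a^2 + b^2$, $\beta = 2a(1+b)$, $\gamma = 2b$, $\delta = 2c^2$ (see Bartlett, 1950). Define

$$a_m^{(n)} = \int_0^\pi \omega^n g^{-m}(\omega)\,d\omega, \qquad b_m^{(n)} = \int_0^\pi (\omega-\lambda)^n g^{-m}(\omega)\,d\omega,$$

which are evaluable for the $g(\lambda)$ of (6.2). Then it will be found that (5.6) yields the equation

$$\pi A^2 + 2A(a_1^{(0)}B - b_1^{(2)}C) + a_2^{(0)}B^2 - 2b_2^{(2)}BC + b_2^{(4)}C^2 = 4Bk_0, \qquad (6.3)$$

where $C = B/U_1^2$, and

$$Aa_1^{(0)} + Ba_2^{(0)} - Cb_2^{(2)} = k_0. \qquad (6.4)$$

The relation for $U_1$ gives

$$Ab_1^{(2)} + Bb_2^{(2)} - Cb_2^{(4)} = k_0B/C, \qquad (6.5)$$

and the unbiasedness condition gives

$$A\pi + Ba_1^{(0)} - Cb_1^{(2)} = g(\lambda). \qquad (6.6)$$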

Substitution of $k_0$ in (6.3) and (6.5) gives, with the aid of (6.6), two equations of second
degree in $A$ and $B$. $A$ being found in terms of $B$ from one of them, a biquadratic equation in
$A$ results from the other. An examination of an actual numerical case of (6.2) revealed that
a solution in the admissible ranges for the unknowns is not always possible; in any case, if
we recall that the solutions even when they exist are only available when $g(\lambda)$ is known, these
equations seem unlikely to be used in practice.


7. THEORETICAL GAIN IN EFFICIENCY WITH WEIGHTING FUNCTION IN RESTRICTED RANGE 


In view of the difficulty of handling the optimum weighting function (on the basis of the
criterion $U$), it may be advisable to demonstrate that the optimum function on the basis
of the criterion $U_2$, but restricted to the small interval $(\lambda-h, \lambda+h)$, viz.

$$w_\lambda(\omega) = g(\lambda)/[2hg(\omega)] \sim 1/2h, \qquad (7.1)$$

may at least be improved on the basis of the criterion $U$. As, for small $h$, $g(\omega) \sim g(\lambda)$ to the
first order, the weighting function in (5.5) may be written to the same degree of
approximation

$$A' + B'(\omega-\lambda)^2,$$

and it is easily found from the further relations for $A$, $B$, $U_1$ and $U_2$ that

$$w_W(\omega) \sim \frac{3}{4h}\left(1 - \frac{(\omega-\lambda)^2}{h^2}\right). \qquad (7.2)$$

Moreover, for (7.2), $W = 3\sqrt{5}/25 = 0.268$.
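The two values of $W$ being compared here (0.289 for the rectangular weighting and 0.268 for the parabolic weighting) can be checked by direct numerical integration. The sketch below treats $g(\omega)$ as constant over $(\lambda-h, \lambda+h)$, which is the small-$h$ approximation used in the text, and takes $r = 2$. The helper name `w_criterion` and the value of `h` are assumptions for illustration only.

```python
import numpy as np

def w_criterion(weight, h, num=200001):
    """W = U1 * U2 / g^2 for r = 2, with g treated as constant on (lam-h, lam+h)."""
    x = np.linspace(-h, h, num)                 # x stands for omega - lam
    w = weight(x, h)
    dx = x[1] - x[0]

    def integ(y):
        return float(np.sum(0.5 * (y[1:] + y[:-1])) * dx)   # trapezoidal rule

    k0 = integ(w)
    u1 = np.sqrt(integ(x**2 * w) / k0)          # resolvability measure (3.5)
    u2_over_g2 = integ(w**2)                    # (3.7) divided by g^2
    return u1 * u2_over_g2

def rectangular(x, h):
    return np.full_like(x, 1.0 / (2.0 * h))     # weighting (7.1)

def parabolic(x, h):
    return 0.75 / h * (1.0 - (x / h) ** 2)      # weighting (7.2)

w_rect = w_criterion(rectangular, h=0.2)
w_para = w_criterion(parabolic, h=0.2)
print(f"rectangular: {w_rect:.3f}, parabolic: {w_para:.3f}")
```

Because $U_1$ scales with $h$ and $U_2$ with $1/h$, the result does not depend on the particular `h` chosen.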


This is only a small improvement over the value 0.289 for $W$ with the Daniell estimate
$g_D(\lambda)$, but it will be noticed that this approximate weighting function is independent of the
unknown spectral function $g(\lambda)$, and hence also usable in practice. As the approximate bias
in the resulting estimate $g_W(\lambda)$ is reduced from $\tfrac16 h^2\,d^2g(\lambda)/d\lambda^2$ to $\tfrac1{10}h^2\,d^2g(\lambda)/d\lambda^2$, the use of
$g_W(\lambda)$ in place of $g_D(\lambda)$ might sometimes be preferable when the periodogram $I(\omega)$ is
available.

The merits of the other estimate considered, $g_B(\lambda)$, rest in its asymptotic unbiasedness
and its convenience of calculation from the first $n'-1$ sample autocovariances (or
autocorrelations). From Table 2 we saw that its efficiency depended on what measure of
resolvability was used.
THE AUTOCORRELATION FUNCTION AND THE
SPECTRAL DENSITY FUNCTION 


By J. WISE 
London School of Economics 


INTRODUCTION AND SUMMARY 


The relationship existing between the autocorrelation function and the spectral density 
function of a stationary, purely non-deterministic stochastic process is well known. Discus- 
sions of the relationship have been given by Wiener (1930) and Khintchine (1934), in the 
case of the continuous process, and by Wold (1954, Chapter 2, §17, pp. 66-75), and Doob 
(1953, Chapter 10, §§3 and 4, pp. 473-86) for the discrete process. The treatment in the 
present paper is confined to the discrete process.* 

The relationship referred to also provides a fundamental connexion between the auto- 
correlation matrix of a process and the corresponding spectral density function of the 
process. Some rather ingenious manipulations on the matrix representation of the spectral 
function have been carried out by Whittle (1951, Chapter 4, §2, pp. 35-6, equations (4.272)
and (4.276)). However, Whittle's work contains some inaccuracies. More important still,
'exact' results exist for those cases in which Whittle gave 'asymptotic' solutions.

In the course of the derivations of these exact relationships, the author uses methods
which appear to be new (though bearing some relationship to André's method of solving
linear difference equations; see C. Jordan, 1947, pp. 587-99) and are capable of wide
application in the theory of discrete linear stochastic processes. The use of exact
relationships does, moreover, lead to a considerable improvement in clarity and rigour.

In view of the fundamental exact relationships existing between the autocovariance 
matrix and the spectral density function, the former might appropriately—in the case of 
the stationary process—be termed the spectral density matrix of the process. Traditional 
terminology has, however, been adhered to in the remainder of the paper. 

The main topics treated below may be classified under the following headings. 


(a) The circular process 
(i) The exact relationship existing between the autocovariance matrix and the spectral 
density function. 
(ii) The latent roots of the autocovariance matrix. 
(iii) The exact relationship between the canonical form of the autocovariance matrix 
and the spectral density function. 


(b) The non-circular process 


(i) The exact relationship corresponding to (a) (i) above. 
(ii) The exact inversion of non-circular autocovariance matrices. 


* This discrete process may, however, be a sample realization drawn from a continuous process. 
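(a) THE CIRCULAR PROCESS
Definitions
The vector of $N$ random variables, given by $x' = \{x_N, x_{N-1}, \ldots, x_1\}$, is assumed to have the
following distributional properties:

$$E\,x = 0, \qquad (1)$$
$$E\,x_t^2 = \sigma^2 \quad \{t = 1, 2, \ldots, N\}, \qquad (2)$$
$$E\,x_t x_{t+s} = \sigma^2\rho_s \quad \{s = 1, 2, \ldots, N\}, \qquad (3)$$

where $\rho_{N+L} = \rho_{N-L} = \rho_L$.

The sequence of values $\rho_1, \rho_2, \ldots, \rho_N$ is termed the autocorrelation function of the process.

$$E\,xx' = V \quad \text{(the autocovariance matrix)}$$

$$= \sigma^2\begin{bmatrix}
1 & \rho_1 & \rho_2 & \rho_3 & \cdots & \rho_2 & \rho_1 \\
\rho_1 & 1 & \rho_1 & \rho_2 & \cdots & \rho_3 & \rho_2 \\
\rho_2 & \rho_1 & 1 & \rho_1 & \cdots & \rho_4 & \rho_3 \\
\vdots & & & & & & \vdots \\
\rho_1 & \rho_2 & \rho_3 & \rho_4 & \cdots & \rho_1 & 1
\end{bmatrix}, \qquad (4)$$

an $N \times N$ matrix.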
Properties 


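The autocovariance matrix, $V$, may, from (4), be expressed in the following form:

$$V = \sigma^2\{I + \rho_1(W + W^{-1}) + \rho_2(W^2 + W^{-2}) + \ldots + \rho_{\frac12(N-1)}(W^{\frac12(N-1)} + W^{-\frac12(N-1)})\}, \qquad (5)$$

where $N$ is a positive odd integer, or, alternatively,

$$V = \sigma^2\{I + \rho_1(W + W^{-1}) + \rho_2(W^2 + W^{-2}) + \ldots + \tfrac12\rho_{\frac12 N}(W^{\frac12 N} + W^{-\frac12 N})\},$$

where $N$ is a positive even integer. For all values of $N$, $I$ denotes the $N \times N$ identity matrix,
and $W$ denotes the $N \times N$ circulant definition of the auxiliary identity matrix, that is to say,

$$W = \begin{bmatrix}
0 & 1 & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 0 & 1 & \cdots & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 \\
1 & 0 & 0 & 0 & \cdots & 0
\end{bmatrix}. \qquad (6)$$

Since the matrices $(W^L + W^{-L})$ for $L = 1, 2, \ldots, N$ are commutative, they may all be
simultaneously reduced to canonical form by a single transformation.
Thus, if we consider the matrix $L$, satisfying the property

$$L'L = I, \qquad (7)$$
$$L'(W^L + W^{-L})L = \operatorname{diag}\{\omega_1^L + \omega_1^{-L}, \ldots, \omega_N^L + \omega_N^{-L}\}, \qquad (8)$$

where $\omega_1, \omega_2, \ldots, \omega_N$ are the $N$ values of the roots of the equation

$$\omega^N - 1 = 0, \qquad (9)$$

then

$$L'VL = \operatorname{diag}\{v_1, v_2, v_3, \ldots, v_N\}$$

provides the canonical form of the autocovariance matrix $V$.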
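Furthermore, we have

$$\omega_s^L + \omega_s^{-L} = 2\cos\frac{2\pi sL}{N} \quad (s = 1, 2, \ldots, N;\ L = 1, 2, \ldots, N). \qquad (10)$$

The latent roots of $V$ are thus given by

$$v_s = \sigma^2\{1 + \rho_1(\omega_s + \omega_s^{-1}) + \rho_2(\omega_s^2 + \omega_s^{-2}) + \ldots + \rho_{\frac12(N-1)}(\omega_s^{\frac12(N-1)} + \omega_s^{-\frac12(N-1)})\}$$
$$= \sigma^2\left\{1 + 2\rho_1\cos\frac{2\pi s}{N} + 2\rho_2\cos\frac{4\pi s}{N} + \ldots + 2\rho_{\frac12(N-1)}\cos\frac{(N-1)\pi s}{N}\right\} \quad (s = 1, 2, \ldots, N)$$

for $N$ odd, or by

$$v_s = \sigma^2\{1 + \rho_1(\omega_s + \omega_s^{-1}) + \rho_2(\omega_s^2 + \omega_s^{-2}) + \ldots + \tfrac12\rho_{\frac12 N}(\omega_s^{\frac12 N} + \omega_s^{-\frac12 N})\}$$
$$= \sigma^2\left\{1 + 2\rho_1\cos\frac{2\pi s}{N} + 2\rho_2\cos\frac{4\pi s}{N} + \ldots + \rho_{\frac12 N}\cos\pi s\right\} \quad (s = 1, 2, \ldots, N) \qquad (11)$$

for $N$ even.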
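The statement that the latent roots of the circular autocovariance matrix are given by the cosine sums in (11) can be checked numerically. The sketch below builds the circulant matrix (4) from a short list of autocorrelations and compares its eigenvalues with (11). The autocorrelation values are arbitrary illustrative numbers and the function names are hypothetical; both the odd and the even case are handled by using the circular lag $\min(j, N-j)$.

```python
import numpy as np

def circular_autocovariance(rho, n_obs, sigma2=1.0):
    """N x N circulant matrix (4); rho[j-1] holds rho_j for j = 1, ..., floor(N/2)."""
    first_row = np.ones(n_obs)
    for j in range(1, n_obs):
        lag = min(j, n_obs - j)                 # circular lag, since rho_{N-L} = rho_L
        first_row[j] = rho[lag - 1]
    idx = np.arange(n_obs)
    return sigma2 * first_row[(idx[None, :] - idx[:, None]) % n_obs]

def latent_roots_from_cosines(rho, n_obs, sigma2=1.0):
    """Formula (11), written with circular lags so that N odd and N even share one loop."""
    s = np.arange(1, n_obs + 1)
    roots = np.ones(n_obs)
    for j in range(1, n_obs):
        lag = min(j, n_obs - j)
        roots += rho[lag - 1] * np.cos(2.0 * np.pi * s * j / n_obs)
    return sigma2 * roots

rho = [0.5, 0.2, 0.1, 0.05]                     # illustrative autocorrelations
for n_obs in (7, 8):
    eig = np.sort(np.linalg.eigvalsh(circular_autocovariance(rho, n_obs)))
    cos_roots = np.sort(latent_roots_from_cosines(rho, n_obs))
    print(n_obs, np.max(np.abs(eig - cos_roots)))
```

Pairing the terms for lags $j$ and $N-j$ in the loop reproduces the factors 2 in (11), and for $N$ even the unpaired lag $\tfrac12 N$ gives the single term $\rho_{\frac12 N}\cos\pi s$.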


The relationships given by (5) and (11) are the correct restatement of similar forms given
by Whittle (1951, Chapter 4, §2, pp. 35-6), which are incorrect even as approximations in
the non-circular case.

General rules for the multiplicities of the latent roots of $V$ have been incorrectly stated
by both Whittle (1951, Chapter 4, §3, p. 37) and Watson (1951*, Chapter 2, §5, p. 50).

Results correctly obtained by R. L. Anderson (1942, §5a, pp. 7-8), on the multiplicities
in the values of the terms in the sequence,

$$\cos\frac{2\pi L}{N},\quad \cos\frac{4\pi L}{N},\quad \cos\frac{6\pi L}{N},\quad \ldots,\quad \cos 2\pi L, \qquad (12)$$

may, however, be used to derive valid statements relating to the multiplicities in the latent
roots of $V$.

Let $N = nk$ and $L = lk$, where $n$, $l$ and $k$ are positive integers, $n$ and $l$ being prime to one
another. Two cases may then be distinguished.

(i) For $n$ odd, there are $\tfrac12(n-1)$ terms in the series (12) occurring with $2k$-fold
multiplicities, together with the value unity occurring with $k$-fold multiplicity.

(ii) For $n$ even, there are $\tfrac12(n-2)$ terms in the series (12) occurring with $2k$-fold
multiplicities, together with each of the values $\pm 1$ occurring with $k$-fold multiplicities. For the
derivation of these results the reader is referred to R. L. Anderson's paper.

Although it is impossible to enumerate here all the possibilities in detail, a ‘rule’ which 
the author has found useful in applications of the theory to sampling problems will be given 
relating to the multiplicities in the latent roots of V. This ‘rule’ is adequate for most of the 
cases which occur in practice. The ‘rule’ is not, however, a theorem, since exceptions to it 
occur when certain relationships exist between the elements of V. Care must therefore be 
taken when it is applied. This ‘rule’ is as follows: 

If $\rho_L \ne 0$ for at least one value of $L$ satisfying the condition $N = nk$ and $L = lk$, $n$ and $l$
being prime to one another, then the multiplicities in the latent roots will be at most $2k$-fold.

Whittle's assertion that if $N$ is odd there are $\tfrac12(N-1)$ pairs of distinct roots, with a single
root greater than or less than all of these others; and if $N$ is even there are $\tfrac12(N-2)$ pairs of
distinct roots with one root greater than and one root less than all of the other roots, is
incorrect.

* Watson states here that 'real symmetric circulant matrices have at most one odd latent root, the
rest being pairwise equal'. That is, of course, incorrect.

An example occurring in the literature on time-series analysis, demonstrating a practical
instance in which the 'rule' stated above yields the correct result, whereas Whittle's
assertion breaks down, is the lag-$L$ definition of the Markoff process, given by

$$x_t - \rho x_{t-L} = \epsilon_t \quad (t = 1, 2, \ldots, N), \qquad (13)$$

where $x_{N+s} = x_s$, $N$ and $L$ not being prime to one another.
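A small numerical case makes the failure of the 'pairs of roots' assertion concrete. The sketch below takes $N = 6$ and $L = 2$ (so $k = 2$, $n = 3$, $l = 1$) for the circular lag-$L$ process (13) and counts the multiplicities of the latent roots. It assumes the latent roots are $\sigma^2/|1 - \rho\,\omega_s^L|^2$ with $\omega_s = e^{2\pi i s/N}$, which is the form the circular theory gives for a purely autoregressive operator; the numerical values of `rho` and `sigma2` are illustrative only.

```python
import numpy as np
from collections import Counter

def lag_markov_roots(n_obs, lag, rho, sigma2=1.0):
    """Latent roots of V for the circular process x_t - rho * x_{t-lag} = e_t."""
    s = np.arange(1, n_obs + 1)
    theta = 2.0 * np.pi * s * lag / n_obs
    # |1 - rho * exp(i * theta)|^2 = 1 - 2 rho cos(theta) + rho^2
    return sigma2 / (1.0 - 2.0 * rho * np.cos(theta) + rho**2)

roots = lag_markov_roots(n_obs=6, lag=2, rho=0.5)
multiplicities = sorted(Counter(np.round(roots, 10)).values())
print("distinct roots:", sorted(set(np.round(roots, 10))))
print("multiplicities:", multiplicities)
```

With $n = 3$ odd, case (i) predicts one value of multiplicity $2k = 4$ and one of multiplicity $k = 2$, whereas the assertion quoted from Whittle would require two pairs and two single roots.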


The spectral function of a circular process 


The spectral density function of the circular process possessing the distributional
properties (1), (2) and (3) is of the form*

$$v(\theta) = \sigma^2\{1 + 2\rho_1\cos\theta + 2\rho_2\cos 2\theta + \ldots + 2\rho_{\frac12(N-1)}\cos\tfrac12(N-1)\theta\} \quad \left\{\theta = \frac{2\pi}{N}, \frac{4\pi}{N}, \ldots, 2\pi\right\},$$

where $N$ is odd; or of the form

$$v(\theta) = \sigma^2\{1 + 2\rho_1\cos\theta + 2\rho_2\cos 2\theta + \ldots + \rho_{\frac12 N}\cos\tfrac12 N\theta\} \quad \left\{\theta = \frac{2\pi}{N}, \frac{4\pi}{N}, \ldots, 2\pi\right\}, \qquad (14)$$

where $N$ is even.

The latent roots of $V$, given by (11), are thus the values of the spectral density function,
$v(\theta)$, taken at constant intervals of $2\pi/N$ in the values of $\theta$.

These roots may thus be regarded as being generated by the spectral density function of 
the process. 

The results obtained so far are, of course, exact for finite $N$. By considering the limiting
values of the roots as $N$ tends to infinity, asymptotic results may be obtained.


The autoregressive-moving average process 


The discrete circular linear process of the autoregressive-moving average type may be 
written in the form 


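$$x_t + \alpha_1x_{t-1} + \ldots + \alpha_kx_{t-k} = \epsilon_t + \beta_1\epsilon_{t-1} + \ldots + \beta_h\epsilon_{t-h} \quad \{t = 1, 2, \ldots, N\}, \qquad (15)$$

where $x_{N+s} = x_s$, $\epsilon_{N+s} = \epsilon_s$, and

$$\left.\begin{aligned}
E\,\epsilon_s &= 0 \quad \{s = 1, 2, \ldots, N\}, \\
E\,\epsilon_s^2 &= \sigma^2 \quad \{s = 1, 2, \ldots, N\}, \\
E\,\epsilon_s\epsilon_t &= 0 \quad \{s \ne t\}.
\end{aligned}\right\} \qquad (16)$$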


* Any discrete stationary purely non-deterministic process may be written in the form

$$x_t = \eta_t + \beta_1\eta_{t-1} + \beta_2\eta_{t-2} + \ldots,$$

where the summation extends to infinity.
The spectral density function of this process is defined as

$$v(\theta) = (1 + \beta_1e^{i\theta} + \beta_2e^{2i\theta} + \ldots)(1 + \beta_1e^{-i\theta} + \beta_2e^{-2i\theta} + \ldots)\operatorname{var}\eta$$

for both circular and non-circular definitions of the process, for samples consisting of any number of
consecutive observations. The relation (14) is obtained by expanding $v(\theta)$, thus defined, as a Fourier
series.

















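In the matrix notation the process becomes

$$(I + \alpha_1W + \ldots + \alpha_kW^k)x = (I + \beta_1W + \ldots + \beta_hW^h)\epsilon, \qquad (17)$$

where $x' = \{x_N, x_{N-1}, x_{N-2}, \ldots, x_2, x_1\}$
and $\epsilon' = \{\epsilon_N, \epsilon_{N-1}, \epsilon_{N-2}, \ldots, \epsilon_2, \epsilon_1\}$.
Solving (17) for $x$, we obtain

$$x = (I + \alpha_1W + \ldots + \alpha_kW^k)^{-1}(I + \beta_1W + \ldots + \beta_hW^h)\epsilon. \qquad (18)$$

Thus, from (18), we obtain

$$E\,x = 0, \qquad (19)$$

and

$$E\,xx' = V \quad \text{(the autocovariance matrix)}$$
$$= \sigma^2(I + \alpha_1W + \ldots + \alpha_kW^k)^{-1}(I + \beta_1W + \ldots + \beta_hW^h)(I + \beta_1W^{-1} + \ldots + \beta_hW^{-h})(I + \alpha_1W^{-1} + \ldots + \alpha_kW^{-k})^{-1}. \qquad (20)$$

The latent roots of $V$ are therefore given by

$$v_s = \sigma^2\frac{(1 + \beta_1\omega_s + \ldots + \beta_h\omega_s^h)(1 + \beta_1\omega_s^{-1} + \ldots + \beta_h\omega_s^{-h})}{(1 + \alpha_1\omega_s + \ldots + \alpha_k\omega_s^k)(1 + \alpha_1\omega_s^{-1} + \ldots + \alpha_k\omega_s^{-k})} \quad (s = 1, 2, \ldots, N)$$
$$= \sigma^2\frac{(1 + \beta_1e^{2i\pi s/N} + \ldots + \beta_he^{2i\pi hs/N})(1 + \beta_1e^{-2i\pi s/N} + \ldots + \beta_he^{-2i\pi hs/N})}{(1 + \alpha_1e^{2i\pi s/N} + \ldots + \alpha_ke^{2i\pi ks/N})(1 + \alpha_1e^{-2i\pi s/N} + \ldots + \alpha_ke^{-2i\pi ks/N})} \quad (s = 1, 2, \ldots, N). \qquad (21)$$

The spectral density function of the circulant process (15) may be written as

$$v(\theta) = \sigma^2\frac{(1 + \beta_1e^{i\theta} + \ldots + \beta_he^{ih\theta})(1 + \beta_1e^{-i\theta} + \ldots + \beta_he^{-ih\theta})}{(1 + \alpha_1e^{i\theta} + \ldots + \alpha_ke^{ik\theta})(1 + \alpha_1e^{-i\theta} + \ldots + \alpha_ke^{-ik\theta})} \quad (0 < \theta \le 2\pi)$$
$$= g(e^{i\theta}) \quad \text{(let us say)} \quad (0 < \theta \le 2\pi). \qquad (22)$$

It follows from (20) and (22) that

$$V = g(W), \qquad (23)$$

and, in addition,

$$v_s = g(\omega_s) = g(e^{2i\pi s/N}) \quad (s = 1, 2, \ldots, N). \qquad (24)$$

These results are exact, and are fundamental to the exact treatment of circular processes.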
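The exact relation between (20) and (24) can be illustrated for a small circular process with one autoregressive and one moving-average term. The sketch below forms $V$ from the matrix product in (20), using the circulant $W$ of (6), and compares its eigenvalues with the spectral density evaluated at the $N$th roots of unity. The values of `n_obs`, `alpha1`, `beta1` and `sigma2` are arbitrary illustrative choices.

```python
import numpy as np

n_obs, alpha1, beta1, sigma2 = 8, -0.6, 0.4, 1.0    # illustrative values only

# Circulant auxiliary identity matrix W of (6): ones on the superdiagonal, wrapping round.
W = np.roll(np.eye(n_obs), 1, axis=1)
I = np.eye(n_obs)

A = I + alpha1 * W            # autoregressive operator, k = 1
B = I + beta1 * W             # moving-average operator, h = 1
A_inv = np.linalg.inv(A)

# Equation (20); W is orthogonal, so W^{-1} equals the transpose of W.
V = sigma2 * A_inv @ B @ B.T @ A_inv.T

# Equation (24): latent roots are g evaluated at the N-th roots of unity.
omega = np.exp(2j * np.pi * np.arange(1, n_obs + 1) / n_obs)
roots = sigma2 * np.abs(1 + beta1 * omega) ** 2 / np.abs(1 + alpha1 * omega) ** 2

eigs = np.linalg.eigvalsh(V)
print(np.max(np.abs(np.sort(eigs) - np.sort(roots))))
```

The agreement holds for this finite $N$ without any limiting argument, which is the sense in which the relationship is exact.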

An asymptotic relationship between $V$ (in the non-circular case, to be further discussed
in §(b) of this paper) and $g(W)$ has been given by Whittle (1951). It is not at all clear in what
sense this asymptotic relationship is valid, unless the exact form is used. Whittle (1952,
Theorem 1, (6) and (7), pp. 49-50), and Wold (1953, Chapter 11, §2, Theorem 1) use the
following interpretation of the asymptotic relationship:

$$\lim_{N\to\infty}\frac{x'Vx}{x'g(W)x} = 1 \qquad (25)$$

for almost all realizations of the process. This result, though valid, is difficult to work with
rigorously. Whittle does not adhere to it in any of his derivations, and for a rigorous
treatment it is preferable to work throughout with exact relationships. The latter are no more
cumbersome than the asymptotic forms, and their use may lead to the avoidance of
substantial errors.
This is abundantly illustrated in the case of the first-order autoregressive process in the
circular case. The largest latent root of the autocorrelation matrix for $\alpha_1 < 0$ is given by

$$\frac{1 - (-\alpha_1)^N}{1 + (-\alpha_1)^N}\cdot\frac{1 - \alpha_1}{1 + \alpha_1}.$$

The corresponding asymptotic form of this is

$$\frac{1 - \alpha_1}{1 + \alpha_1},$$

which differs substantially from the exact value when $N$ is small and $-\alpha_1$ is close to unity. In fact, as $-\alpha_1$
tends to unity, the exact form tends in value to $N$, while the asymptotic form tends to
infinity.

A final point which should be made on the subject of the latent roots of circulant variance
matrices is that in the derivation of the values of, and of the multiplicities in these roots,
it is usually more convenient to consider the spectral density function in the form (22) than
in the form (14).
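The gap between the exact and asymptotic expressions for the largest latent root of the circular first-order autoregressive process is easy to tabulate. The sketch below evaluates both for a few values of $\alpha_1$ with $N = 10$; the function names and the chosen values are illustrative assumptions.

```python
def largest_root_exact(alpha1, n_obs):
    """Largest latent root of the circular AR(1) autocorrelation matrix, alpha1 < 0."""
    a = -alpha1
    return (1.0 - a**n_obs) / (1.0 + a**n_obs) * (1.0 + a) / (1.0 - a)

def largest_root_asymptotic(alpha1):
    """Limiting form as N tends to infinity."""
    return (1.0 - alpha1) / (1.0 + alpha1)

for alpha1 in (-0.5, -0.9, -0.99):
    exact = largest_root_exact(alpha1, 10)
    asym = largest_root_asymptotic(alpha1)
    print(f"alpha1 = {alpha1}: exact {exact:.3f}, asymptotic {asym:.3f}")
```

For $\alpha_1 = -0.5$ the two agree closely, while for $\alpha_1$ near $-1$ the exact value stays below $N$ and the asymptotic value grows without bound.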


(b) THE NON-CIRCULAR PROCESS


Whittle (1951, Chapter 4, §2, relations (4.253), (4.262) and (4.28), pp. 34-6) used the
'asymptotic' form of (23) to obtain a simple method for the 'approximate' inversion of $V$
in the non-circular case. For many purposes, approximate inversion is not adequate, and
is not in any case necessary. It will be shown below that a very simple method of exact
inversion is available. Whittle (1951, Chapter 4, §2, relations (4.242)-(4.252), pp. 33-4)
does give a cumbersome method of exact inversion based on the auxiliary identity matrix,
denoted by $U$, where

$$U = \begin{bmatrix}
0 & 1 & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 0 & 1 & \cdots & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & 0 & \cdots & 0
\end{bmatrix}, \qquad (26)$$

an $N \times N$ matrix. Unfortunately Whittle's method appears to give the required result only
in the case of the first-order autoregressive process.

An exact method of inversion will be given below which utilizes a non-circular counter- 
part of (20), the relation between the autocovariance matrix and the spectral density 
function. 

The non-circular discrete linear process, generated by the relationship

$$x_t + \alpha_1x_{t-1} + \ldots + \alpha_kx_{t-k} = \epsilon_t + \beta_1\epsilon_{t-1} + \ldots + \beta_h\epsilon_{t-h}, \qquad (27)$$


will be assumed to be semi-infinite. This does not prevent us from deriving results from 
samples consisting of N successive terms of the process. Such results for finite samples 
follow immediately from the case of the semi-infinite process. This is due to the fact that the 
autocovariance function and the spectral density function are both completely independent 
of N, a property which does not hold in the circular case. 

We will assume (27) to represent a purely non-deterministic stationary process. An 
important theorem obtained by Wold (1954, Chapter 2, § 20, pp. 84-9, especially Theorem 7, 
p. 89) states, furthermore, that every semi-infinite, stationary, purely non-deterministic 
process can be expressed in the form (27). 

A condition which is both necessary and sufficient for the stationarity of the process (27) 
can be obtained from a result given by Doob (1953). 





The spectral density function of the process (27) is given by

$$v(\theta) = \sigma^2\frac{(1 + \beta_1e^{i\theta} + \ldots + \beta_he^{ih\theta})(1 + \beta_1e^{-i\theta} + \ldots + \beta_he^{-ih\theta})}{(1 + \alpha_1e^{i\theta} + \ldots + \alpha_ke^{ik\theta})(1 + \alpha_1e^{-i\theta} + \ldots + \alpha_ke^{-ik\theta})} = g(e^{i\theta}) \quad (0 < \theta \le 2\pi).$$

It may, in passing, be noted that this is the same as (22), given for the circular process.

According to Doob (1953, Chapter 10, pp. 452-506, especially §10, pp. 501-6), the
necessary and sufficient condition for a stochastic process with the spectral function (22)
to be stationary in the 'wide sense' (Chapter 2, §8, pp. 94-5) is that the equation

$$z^h + \beta_1z^{h-1} + \ldots + \beta_h = 0$$

shall have no roots outside the unit circle, and that the roots of the equation

$$z^k + \alpha_1z^{k-1} + \ldots + \alpha_k = 0$$

shall all lie inside the unit circle. The reader is referred to Doob's* penetrating exposition
for a more detailed treatment of this topic.
The relationship (27) may be transcribed as

$$(I + \alpha_1U + \ldots + \alpha_kU^k)x = (I + \beta_1U + \ldots + \beta_hU^h)\epsilon, \qquad (28)$$

where $x$ and $\epsilon$ are semi-infinite vectors, and $U$ is the semi-infinite auxiliary identity matrix.
Solving (28) for $x$ leads to

$$x = (I + \alpha_1U + \ldots + \alpha_kU^k)^{-1}(I + \beta_1U + \ldots + \beta_hU^h)\epsilon. \qquad (29)$$

From (29) we deduce

$$E\,x = 0, \qquad (30)$$

and

$$V = E\,xx' = \sigma^2(I + \alpha_1U + \ldots + \alpha_kU^k)^{-1}(I + \beta_1U + \ldots + \beta_hU^h)(I + \beta_1U' + \ldots + \beta_hU'^h)(I + \alpha_1U' + \ldots + \alpha_kU'^k)^{-1}, \qquad (31)$$


which is thus the autocovariance matrix of the process (27). This is the exact, non-circular
counterpart to the relationships (20) and (23), given above for the case of the circular process.
Relationship (31) bears a close affinity to the 'autocovariance generating function' given
by

$$\sigma^2\frac{(1 + \beta_1z + \ldots + \beta_hz^h)(1 + \beta_1z^{-1} + \ldots + \beta_hz^{-h})}{(1 + \alpha_1z + \ldots + \alpha_kz^k)(1 + \alpha_1z^{-1} + \ldots + \alpha_kz^{-k})}. \qquad (32)$$

However, (31) is much more useful. From it can be obtained directly (i) the autocovariance
function of the process (27), this being the property (31) shares with (32); (ii) the inverse
of $V$ in its exact form; (iii) the integral and rational powers of $V$ and $V^{-1}$.

Thus we have, from the elementary rule on the inversion of a product of several matrices,

$$\sigma^2V^{-1} = (I + \alpha_1U' + \ldots + \alpha_kU'^k)(I + \beta_1U' + \ldots + \beta_hU'^h)^{-1}(I + \beta_1U + \ldots + \beta_hU^h)^{-1}(I + \alpha_1U + \ldots + \alpha_kU^k), \qquad (33)$$

which is no more difficult to evaluate exactly than the matrix product in (31), which is the
autocovariance matrix itself.

It is apparent from (31) and (33) that although $V$ is a Laurent matrix, $V^{-1}$ is not, a
property already noted by Whittle.


* See also Wise (1955, Chapter III) for a further analysis of this problem. 
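The autoregressive process


To illustrate the application of the above results to problems of importance which have not
been solved in the literature even in special cases, the inverse of the autocovariance matrix
of the autoregressive process generated by the relation

$$x_t + \alpha_1x_{t-1} + \ldots + \alpha_kx_{t-k} = \epsilon_t \qquad (34)$$

will now be derived for the semi-infinite process.
From this result the inverses of the autocovariance matrices of $N$ successive observations
from the process (34) are then given in full for $k = 1$, $k = 2$ and $k = 3$.
From (33) we deduce for the special case (34) the result

$$\sigma^2V^{-1} = (I + \alpha_1U' + \ldots + \alpha_kU'^k)(I + \alpha_1U + \ldots + \alpha_kU^k). \qquad (35)$$

Taking $k = 1$, 2 and 3 respectively, we deduce at once from (35) the matrices

(i) $k = 1$:

$$\sigma^2V^{-1} = \begin{bmatrix}
1 & \alpha_1 & 0 & \cdots & 0 & 0 \\
\alpha_1 & 1+\alpha_1^2 & \alpha_1 & \cdots & 0 & 0 \\
0 & \alpha_1 & 1+\alpha_1^2 & \cdots & 0 & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & 0 & \cdots & 1+\alpha_1^2 & \alpha_1 \\
0 & 0 & 0 & \cdots & \alpha_1 & 1
\end{bmatrix}, \qquad (36)$$

an $N \times N$ matrix.

(ii) $k = 2$:

$$\sigma^2V^{-1} = \begin{bmatrix}
1 & \alpha_1 & \alpha_2 & 0 & \cdots & 0 & 0 \\
\alpha_1 & 1+\alpha_1^2 & \alpha_1+\alpha_1\alpha_2 & \alpha_2 & \cdots & 0 & 0 \\
\alpha_2 & \alpha_1+\alpha_1\alpha_2 & 1+\alpha_1^2+\alpha_2^2 & \alpha_1+\alpha_1\alpha_2 & \cdots & 0 & 0 \\
0 & \alpha_2 & \alpha_1+\alpha_1\alpha_2 & 1+\alpha_1^2+\alpha_2^2 & \cdots & 0 & 0 \\
\vdots & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1+\alpha_1^2 & \alpha_1 \\
0 & 0 & 0 & 0 & \cdots & \alpha_1 & 1
\end{bmatrix}, \qquad (37)$$

an $N \times N$ matrix.

(iii) $k = 3$:

$$\sigma^2V^{-1} = \begin{bmatrix}
1 & \alpha_1 & \alpha_2 & \alpha_3 & \cdots & 0 & 0 \\
\alpha_1 & 1+\alpha_1^2 & \alpha_1+\alpha_1\alpha_2 & \alpha_2+\alpha_1\alpha_3 & \cdots & 0 & 0 \\
\alpha_2 & \alpha_1+\alpha_1\alpha_2 & 1+\alpha_1^2+\alpha_2^2 & \alpha_1+\alpha_1\alpha_2+\alpha_2\alpha_3 & \cdots & 0 & 0 \\
\alpha_3 & \alpha_2+\alpha_1\alpha_3 & \alpha_1+\alpha_1\alpha_2+\alpha_2\alpha_3 & 1+\alpha_1^2+\alpha_2^2+\alpha_3^2 & \cdots & 0 & 0 \\
\vdots & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1+\alpha_1^2 & \alpha_1 \\
0 & 0 & 0 & 0 & \cdots & \alpha_1 & 1
\end{bmatrix}. \qquad (38)$$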








The result (36) confirms that obtained for $k = 1$ by Cochrane & Orcutt (1949, equation (6.14),
p. 57). However, the results (37) and (38) for $k = 2$ and $k = 3$ respectively do not appear
to have been given previously in the literature.* So far as the author is aware, (35) is new,
and the inverses of the autocovariance matrices for $k = 4, 5, 6, \ldots$, etc., can be written down
immediately from it, rendering 'approximate' methods of inversion unnecessary.
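The banded inverse for $k = 2$ can be checked against a direct numerical inversion of the autocovariance matrix of $N$ successive observations. The sketch below is an illustration under stated assumptions: the autocovariances are generated from the Yule-Walker recursion for a stationary second-order process, and the helper `banded_inverse` fills element $(i, j)$, $i \le j$, with $\sum_m \alpha_m\alpha_{m+j-i}$ for $m = 0, \ldots, \min(i-1, N-j)$ and $\alpha_0 = 1$. That summation rule is an assumed compact way of writing the pattern shown in (36) to (38), taken to hold for $N \ge 2k$; it is not a formula quoted from the paper, and the coefficient values are arbitrary.

```python
import numpy as np

def ar2_autocovariances(alpha1, alpha2, n_obs, sigma2=1.0):
    """Autocovariances of x_t + alpha1 x_{t-1} + alpha2 x_{t-2} = e_t (Yule-Walker)."""
    phi1, phi2 = -alpha1, -alpha2
    rho = np.empty(n_obs)
    rho[0] = 1.0
    rho[1] = phi1 / (1.0 - phi2)
    for s in range(2, n_obs):
        rho[s] = phi1 * rho[s - 1] + phi2 * rho[s - 2]
    gamma0 = sigma2 / (1.0 - phi1 * rho[1] - phi2 * rho[2])
    return gamma0 * rho

def banded_inverse(alphas, n_obs):
    """sigma^2 V^{-1} laid out as in (36)-(38); assumes n_obs >= 2 * len(alphas)."""
    a = np.concatenate(([1.0], alphas, np.zeros(n_obs)))    # a[0] = 1, zero beyond order k
    out = np.zeros((n_obs, n_obs))
    for i in range(n_obs):
        for j in range(i, n_obs):
            top = min(i, n_obs - 1 - j)         # zero-based form of min(i - 1, N - j)
            out[i, j] = out[j, i] = sum(a[m] * a[m + j - i] for m in range(top + 1))
    return out

alpha1, alpha2, n_obs = -0.5, 0.3, 7            # illustrative stationary case
gamma = ar2_autocovariances(alpha1, alpha2, n_obs)
idx = np.arange(n_obs)
V = gamma[np.abs(idx[:, None] - idx[None, :])]  # Toeplitz autocovariance matrix

direct = np.linalg.inv(V)                       # sigma^2 = 1 here
banded = banded_inverse([alpha1, alpha2], n_obs)
print(np.max(np.abs(direct - banded)))
```

The interior rows of `banded` repeat the pattern $\alpha_2,\ \alpha_1+\alpha_1\alpha_2,\ 1+\alpha_1^2+\alpha_2^2$, and only the first and last $k$ rows differ, which is why the inverse is not a Laurent matrix.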


* Note, however, Champernowne (1948, p. 206, equation 3.5), which gives the inverse implicitly
for the case of the autoregressive process.
SAMPLING PROPERTIES OF LOCAL STATISTICS IN STATIONARY
STOCHASTIC SERIES 


By G. H. JOWETT 
University of Sheffield 


1. INTRODUCTION 


The sampling moments of large-sample statistics calculated from stationary time series 
usually require for their evaluation the summation of long series of cumulants or products 
of cumulants of the series, or for normal series the summation of long series of autocorrela- 
tions or products of autocorrelations (cf. Bartlett, 1946). These series usually converge only 
as the cumulants in question tend to zero, which is often in practice so slow a process as to 
make their evaluation a matter of considerable difficulty. There is, however, a class of 
statistics which may be called local in that they are dependent on short-term comparisons 
of terms in the series, and for these the summations are of such a kind as to converge as 
certain finite differences of the cumulants tend to zero; in practice this often happens much 
more quickly than the tending to zero of the cumulants themselves, and the corresponding 
formulae are accordingly much easier to evaluate in practice. 

A simple illustration of this difference is provided by a comparison between the simplest 
statistic which depends on local comparisons and the simplest statistic which does not. 
It is advantageous to give this illustration before attempting a general discussion, so as 
to give a simple outline of the argument and the ideas involved. 

Suppose $x_1, x_2, \ldots, x_{2n-1}, x_{2n}$ to be a set of $2n$ evenly spaced observations from a stationary
time series with constant mean, variance $\sigma^2$, and autocorrelation function $\rho_s$. The sampling variance
of the statistic

$$U = n^{-1}[(x_1 - x_2) + (x_3 - x_4) + \ldots + (x_{2n-1} - x_{2n})] \qquad (1)$$

is determined as follows:

$$\operatorname{var}U = n^{-2}\sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{cov}[(x_{2i-1} - x_{2i}), (x_{2j-1} - x_{2j})] \qquad (2)$$

$$= n^{-2}\sum_{i=1}^{n}\sum_{j=1}^{n}\sigma^2(-\rho_{2i-2j-1} + 2\rho_{2i-2j} - \rho_{2i-2j+1}) \qquad (3)$$

$$= \frac{1}{n}\sum_{s=-(n-1)}^{+(n-1)}\left(1 - \frac{|s|}{n}\right)(-\Delta''\sigma^2\rho_{2s}), \qquad (4)$$

where $\Delta''$ is the second central difference operator in Comrie's notation, and refers to the
suffix of $\rho$. The need for this statistic and its sampling variance might arise, for example, in
the estimation of the treatment effect in a systematic experiment having the pattern

$$T\,C\,T\,C \ldots T\,C \quad (T = \text{treatment};\ C = \text{control}), \qquad (5)$$

or, alternatively, in the statistical control of the difference between the means of two
interpenetrating systematic samples (Jowett, 1955b).
On the other hand, the sampling variance of the simple arithmetic mean

$$T = n^{-1}(x_1 + x_3 + \ldots + x_{2n-1}) \qquad (6)$$

is easily shown to be given by the formula

$$\operatorname{var}T = \frac{1}{n}\sum_{s=-(n-1)}^{+(n-1)}\left(1 - \frac{|s|}{n}\right)\sigma^2\rho_{2s}. \qquad (7)$$

In many applications, as $|s|$ increases the autocorrelation function tends to become
'locally linear', i.e. approximately linear over intervals of width equal, say, to twice the
interval between successive terms of the series; as $|s|$ increases further it gradually tends
to zero. Thus $\Delta''\rho_{2s}$ may attain a sufficient degree of smallness to be neglected in
computing the approximate value of the sum in (4) at a much earlier stage in the increase of
$|s|$ than that for which $\rho_{2s}$ can be neglected in computing the approximate value of the
sum in (7).
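The two variance formulae (4) and (7) can be checked against a direct evaluation of the quadratic form in the covariance matrix of the $2n$ observations. The sketch below does this for an exponentially decaying autocorrelation function; that choice of `rho`, the value of `n`, and the function names are assumptions made only for illustration.

```python
import numpy as np

def var_linear(coeffs, rho, sigma2=1.0):
    """Exact variance of sum(coeffs * x) for autocorrelation function rho(lag)."""
    t = np.arange(len(coeffs))
    cov = sigma2 * rho(np.abs(t[:, None] - t[None, :]))
    return float(coeffs @ cov @ coeffs)

def var_u_formula(n, rho, sigma2=1.0):
    """Equation (4): weighted sum of minus the second central differences of rho."""
    s = np.arange(-(n - 1), n)
    second_diff = (rho(np.abs(2 * s + 1)) - 2.0 * rho(np.abs(2 * s))
                   + rho(np.abs(2 * s - 1)))
    return float(np.sum((1.0 - np.abs(s) / n) * (-sigma2 * second_diff)) / n)

def var_t_formula(n, rho, sigma2=1.0):
    """Equation (7): the same weights applied to rho itself."""
    s = np.arange(-(n - 1), n)
    return float(np.sum((1.0 - np.abs(s) / n) * sigma2 * rho(np.abs(2 * s))) / n)

def rho(lag):
    return np.exp(-0.05 * lag)                  # hypothetical autocorrelation function

n = 20
u_coeffs = np.tile([1.0, -1.0], n) / n          # U of (1): alternating signs
t_coeffs = np.tile([1.0, 0.0], n) / n           # T of (6): odd-numbered terms only

print("var U:", var_linear(u_coeffs, rho), var_u_formula(n, rho))
print("var T:", var_linear(t_coeffs, rho), var_t_formula(n, rho))
```

With a slowly decaying `rho`, the terms of (4) are second differences and are small throughout, while the terms of (7) remain appreciable over many lags, which is the contrast the text draws.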

Now experience and a priori considerations (cf. Jowett, 1953) suggest that for many
series observed in practice the graph of the serial variation function

$$\delta_s = E[\tfrac{1}{2}(x_i - x_{i+s})^2] \tag{8}$$

is zero for s = 0, rises steeply as s increases with gradually decreasing gradient and curvature,
and ultimately flattens out when s becomes large, rather like the graph of the function

$$y = A(1 - e^{-kx})$$

for positive values of the constants A and k.
Since

$$\delta_s = \sigma^2(1-\rho_s), \tag{9}$$

it follows that

$$-\Delta''\sigma^2\rho_s = \Delta''\delta_s, \tag{10}$$

and hence that the graph of $\sigma^2\rho_s$ will have the same shape but inverted. The serial variation
function has the advantage that for short lags it is effectively independent of long-term
variation in the series and requires no reference to the series mean, qualities which make
it eminently suitable for use in the study of local variation. If for $s > s_0(\epsilon)$ the serial variation
function has become approximately linear to such an extent that $|\Delta''\delta_{2s}|$ is less than some
predetermined small quantity $\epsilon$, we deduce from (4) that


$$\operatorname{var} U = \frac{1}{n}\sum_{s=-(s_0-1)}^{+(s_0-1)}\left(1-\frac{|s|}{n}\right)\Delta''\delta_{2s} + R, \tag{11}$$

where

$$|R| < \frac{2\epsilon}{n}\sum_{s=s_0}^{n-1}\left(1-\frac{|s|}{n}\right). \tag{12}$$

Hence the error in the estimate of var U committed by truncating the summation at $\pm(s_0 - 1)$
is at worst of the same order of magnitude as the departure from local linearity of $\delta_s$ over the
rest of the range, which is usually the same as that of the first term omitted from the
summation, since $|\Delta''\delta_{2s}|$ will usually tend to decrease monotonically for $s > s_0$.
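The truncation in (11) and the bound (12) can be illustrated with a short calculation. This is a sketch under stated assumptions: the serial variation function is taken to be $\delta_s = A(1 - e^{-k|s|})$, the shape suggested in the text, and $\epsilon$ is taken as the magnitude of the first omitted second difference; the function names and parameter values are hypothetical.

```python
import math


def delta(s, A=1.0, k=0.5):
    """Assumed serial variation function delta_s = A(1 - exp(-k|s|))."""
    return A * (1.0 - math.exp(-k * abs(s)))


def second_diff(s):
    """Second central difference of delta at the even lag 2s."""
    return delta(2 * s + 1) - 2.0 * delta(2 * s) + delta(2 * s - 1)


def var_u(n, s_max):
    """Sum in (11) over |s| <= s_max; s_max = n - 1 reproduces the full sum (4)."""
    return sum((1.0 - abs(s) / n) * second_diff(s)
               for s in range(-s_max, s_max + 1)) / n


def remainder_bound(n, s0):
    """Right-hand side of (12), with epsilon set to |second difference at lag 2*s0|."""
    eps = abs(second_diff(s0))
    return 2.0 * eps / n * sum(1.0 - s / n for s in range(s0, n))


n, s0 = 50, 6
full = var_u(n, n - 1)
truncated = var_u(n, s0 - 1)
```

With these assumed values the difference between `full` and `truncated` stays within `remainder_bound(n, s0)`, which is the content of (12) for this particular $\delta_s$.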

Since we are usually obliged to infer the stochastic properties from one or at most a very
few realizations of the series itself, it will often be true that var U is much easier to estimate
than var T; the need to estimate the form of $\rho_s$ for large lags, often a matter of considerable
difficulty, is not present, and the formula (4) may be computed approximately (provided,
of course, that $\epsilon = o(n^{-1})$ and $s_0 \ll n$) using serial variation statistics with relatively short
lags, which are usually readily available.

It will be observed that U is a function of the set of differences $x_1 - x_2, x_3 - x_4, \dots$, which
might be described as local comparisons, since they are constructed from adjacent terms of
the series. It is this which gives rise to the differencing in (4), which makes the formula
independent of $\sigma^2$, and hence expressible entirely in terms of $\delta_s$, and which enables var U
to be estimated in many circumstances where it would be impossible to estimate var T.
A corresponding property has been observed in other statistics, such as the mean of a
systematic sample (cf. Jowett, 1952) or a trend-reduced regression coefficient (Jowett, 1955a),
which are functions of local comparisons alone, and suggests that the sampling properties
of statistics which are built up entirely from local comparisons are essentially determined
by the local variational properties (e.g. $\delta_s$ for small s) of the series themselves, and are
effectively independent of the long-term variational properties of the series which are so
difficult to measure in practice. The rest of this paper will be concerned with establishing
this principle, and with illustrating it by some specific formulae.


2. SAMPLING MOMENTS OF LOCAL STATISTICS IN STATIONARY NORMAL SERIES 


We shall assume a reference space S with position vector t, and shall use the symbol $\tau$ to
denote a displacement vector, e.g. $t_1 - t_2$. This is a generalization of the usual time axis in
one dimension. A set of stochastic variables $x_1(t), x_2(t), \dots$ (which may be regarded as
components of a vector random function) with respective means $\mu_1, \mu_2, \dots$ and variances
$\sigma_1^2, \sigma_2^2, \dots$, is defined at the points of S; in practice the definition is often at a lattice of points
only, but for generality we shall assume definition at all points of S. These variables will be
taken to form stationary series, i.e. their probability parameters to be invariant under
translation in S.

We shall be concerned with seminvariant local linear functions (s.l.l.f.’s) of these variables
defined as follows:

$$L_{\alpha i} = \int_S x_i(t)\,dL_\alpha(t), \qquad \int_S dL_\alpha(t) = 0. \tag{13}$$

The integration is to be understood in the Stieltjes sense, so that

$$L_{\alpha i} = \sum_t l^{*}_{\alpha t}\,x_i(t) + \int_S x_i(t)\,l^{**}_{\alpha}(t)\,dt, \tag{14}$$

where $l^{*}_{\alpha t}$ is zero except at a finite number of points of S, and $l^{**}_{\alpha}(t)$ is integrable in the ordinary
Riemann sense. These linear functions of x(t) are called seminvariant because they are
unchanged by the addition of any constant to x(t); to justify the adjective local, $l^{*}_{\alpha t}$ and
$l^{**}_{\alpha}(t)$ will be taken as zero outside a minimal sphere in S of diameter $a_\alpha$, bounded above by
some fixed value a. These s.l.l.f.’s are generalizations of the simple local comparisons
implied in § 1.

It will be assumed that the series are multivariate normal. This assumption may, how- 
ever, be dispensed with when we are concerned only with second moments of linear statistics, 
since the question of normality does not then arise. The assumption of normality implies that 


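$$E\,x_1(t_1)\,x_2(t_2)\cdots x_s(t_s) = \begin{cases} \mu_1\mu_2\cdots\mu_s + \text{other terms each having at least one of } \mu_1, \mu_2, \dots, \mu_s \text{ as a factor} & (s \text{ odd}),\\[4pt] \mu_1\mu_2\cdots\mu_s + \sum\limits_{\alpha\beta\dots\mu\nu}\operatorname{cov}_{x_\alpha x_\beta}(t_\alpha - t_\beta)\operatorname{cov}_{x_\gamma x_\delta}(t_\gamma - t_\delta)\cdots\operatorname{cov}_{x_\mu x_\nu}(t_\mu - t_\nu) \\ \quad + \text{ other terms each having at least one of } \mu_1 \dots \mu_s \text{ as a factor} & (s \text{ even}), \end{cases} \tag{15}$$

where $\alpha\beta\dots\mu\nu$ is any permutation of $1 \dots s$ such that

$$\alpha<\beta,\ \gamma<\delta,\ \dots,\ \mu<\nu \quad\text{and}\quad \alpha<\gamma<\dots<\mu. \tag{16}$$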
If we prefer to work with lag variation parameters defined by

$$\delta_{x_i x_j}(t-t') = E\,\tfrac{1}{2}(x_i(t) - x_j(t'))^2, \tag{17}$$

we have

$$\delta_{x_i x_j}(t-t') = \tfrac{1}{2}(\mu_i-\mu_j)^2 + \tfrac{1}{2}(\sigma_i^2+\sigma_j^2) - \operatorname{cov}_{x_i x_j}(t-t') \tag{18}$$

as the relation which permits us to substitute $\delta$ for cov. In analysis concerned with s.l.l.f.’s,
the seminvariance usually results in the elimination of the terms involving means and
variances. For example,


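$$\operatorname{cov}(L_{\alpha i}, L_{\beta j}) = E(L_{\alpha i}L_{\beta j}) \quad (\text{since } E(L_{\alpha i}) = 0)$$

$$= \int\!\!\int E[x_i(t)\,x_j(t')]\,dL_\alpha(t)\,dL_\beta(t')$$

$$= \int\!\!\int \operatorname{cov}_{x_i x_j}(t-t')\,dL_\alpha(t)\,dL_\beta(t')$$

$$= \int\!\!\int -\delta_{x_i x_j}(t-t')\,dL_\alpha(t)\,dL_\beta(t'). \tag{19}$$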


The double integration is interpretable in the obvious way in terms of summations and 
Riemann integrations. 
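For weights concentrated on a lattice, the last step of (19) reduces to a finite double sum, and the substitution of $-\delta$ for cov can be verified directly. The sketch below assumes a single series with covariance function $\sigma^2\phi^{|h|}$ and two hypothetical sets of weights, each summing to zero; none of these choices comes from the paper.

```python
def cov_fn(h, sigma2=2.0, phi=0.6):
    """Assumed covariance function of the series at lag h."""
    return sigma2 * phi ** abs(h)


def lag_variation(h, sigma2=2.0, phi=0.6):
    """Lag variation function delta(h) = sigma^2 - cov(h) for a single series."""
    return sigma2 - cov_fn(h, sigma2, phi)


def double_sum(l_weights, m_weights, kernel):
    """Sum of kernel(t - t') * l(t) * m(t') over lattice points; weights are {position: weight}."""
    return sum(lw * mw * kernel(t - u)
               for t, lw in l_weights.items()
               for u, mw in m_weights.items())


# Two hypothetical seminvariant local linear functions (weights sum to zero).
L_alpha = {0: 1.0, 1: -1.0}
L_beta = {3: 0.5, 4: -1.0, 5: 0.5}

via_cov = double_sum(L_alpha, L_beta, cov_fn)
via_delta = -double_sum(L_alpha, L_beta, lag_variation)
```

The two values coincide because the constant $\sigma^2$ is annihilated by weights that sum to zero, which is the seminvariance property used in (19).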

In addition to the assumption of normality, we shall make a further important assumption
about the nature of the lag variation functions (l.v.f.’s), an assumption which is very
often reasonable in practice (cf. Jowett, 1953). We shall assume that the l.v.f.’s tend to
become increasingly linear as $|\tau|$ increases; or, more precisely, that for arbitrary (small)
$\epsilon > 0$, and for arbitrary c, there is a value $h_0(\epsilon, c)$ such that for $h > h_0$ and $h < |\tau| < h + c$, we can
find a constant $\nu_{ij}$ and a constant vector $\kappa_{ij}$ such that

$$\delta_{x_i x_j}(\tau) = \nu_{ij} + \kappa_{ij}\cdot\tau + \epsilon_{ij}(\tau), \tag{20}$$

where

$$|\epsilon_{ij}(\tau)| < \epsilon. \tag{21}$$


Two s.l.l.f.’s will be described as far apart if their associated minimal spheres touch the
outside of a sphere of diameter greater than some predetermined distance $h_0$. If we take $h_0$
as dependent on $\epsilon$ as described above, and take c as twice the maximum local diameter a,
it follows that if $L_\alpha$, $L_\beta$ are far apart,

$$\operatorname{cov}(L_{\alpha i}, L_{\beta j}) = \int\!\!\int -\delta_{x_i x_j}(t-t')\,dL_\alpha(t)\,dL_\beta(t')$$

$$= -\int\!\!\int [\nu_{ij} + \kappa_{ij}\cdot(t-t') + \epsilon_{ij}(t-t')]\,dL_\alpha(t)\,dL_\beta(t')$$

$$= -\int\!\!\int \epsilon_{ij}(t-t')\,dL_\alpha(t)\,dL_\beta(t'), \tag{22}$$

and hence that

$$|\operatorname{cov}(L_{\alpha i}, L_{\beta j})| < \epsilon\int_S |dL_\alpha(t)|\int_S |dL_\beta(t')| = O(\epsilon). \tag{23}$$

On the other hand, if $L_\alpha$ and $L_\beta$ are not far apart, $\operatorname{cov}(L_{\alpha i}, L_{\beta j})$ is a function of $\delta_{x_i x_j}(\tau)$
over the range

$$0 \le |\tau| < h_0 + 2a, \tag{24}$$

i.e. in what may be called a $(h_0 + 2a)$ neighbourhood of $\tau = 0$. If $h_0$ and a are sufficiently
small (in practice, $h_0$ is often comparable with a in magnitude), the behaviour of $\delta_{x_i x_j}(\tau)$
in such a neighbourhood may be interpreted as a local variational property of the series.

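This property of being either dependent on the l.v.f. in a neighbourhood of $\tau = 0$ or of
magnitude $O(\epsilon)$ is also true of the cross-moment of several s.l.l.f.’s. For the cross-moment
of s of these functions, which have zero expectations, we have

$$E(L_{1i_1}L_{2i_2}\cdots L_{si_s}) = \int\!\!\int\!\cdots\!\int E[x_{i_1}(t)\,x_{i_2}(t')\cdots x_{i_s}(t^{(s-1)})]\,dL_1(t)\,dL_2(t')\cdots dL_s(t^{(s-1)})$$

$$= \begin{cases} 0 & \text{if } s \text{ is odd},\\[4pt] \displaystyle\int\!\!\int\!\cdots\!\int (-1)^{\frac{1}{2}s}\sum_{\alpha\beta\dots\mu\nu}[\delta_{(\alpha\beta)}(t^{(\alpha-1)}-t^{(\beta-1)})\,\delta_{(\gamma\delta)}(t^{(\gamma-1)}-t^{(\delta-1)})\cdots\delta_{(\mu\nu)}(t^{(\mu-1)}-t^{(\nu-1)})] \\ \qquad\times\, dL_1(t)\cdots dL_s(t^{(s-1)}) & \text{if } s \text{ is even}, \end{cases} \tag{25}$$

(where the suffix $(\alpha\beta)$ means $x_{i_\alpha}x_{i_\beta}$), since all the terms in the expectation of the product
in the square bracket which have one or more of $\mu_1 \dots \mu_s$ as factor are annihilated in the
integration by the seminvariant property of the corresponding L’s.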


The summation in (25) is taken over the set of permutations of 1, 2, ..., s defined in (16).
Hence if s is even,

$$E(L_{1i_1}\cdots L_{si_s}) = \sum \operatorname{cov}(L_{\alpha i_\alpha}, L_{\beta i_\beta})\operatorname{cov}(L_{\gamma i_\gamma}, L_{\delta i_\delta})\cdots\operatorname{cov}(L_{\mu i_\mu}, L_{\nu i_\nu}). \tag{26}$$

Since the right-hand side of (26) is a function of covariances, it follows that $E(L_{1i_1}L_{2i_2}\cdots L_{si_s})$
is either dependent on the l.v.f. in a neighbourhood of $\tau = 0$ or of magnitude at most of
order $\epsilon$.

If the series $x_1(t), \dots$ are made to coincide in sets, and the s.l.l.f.’s $L_{1i_1}, L_{2i_2}, \dots$ to
coincide in sets included in them, the result just established takes the following form:

Theorem. The expectation of any product of powers of s.l.l.f.’s of any set of concomitant
stationary stochastic variables for which the s.v.f.’s and the l.v.f.’s have a tendency towards
linearity with increasing $|\tau|$ (as defined by (20) and (21) and the preceding paragraph)
is a sum of terms which are either of magnitude at most of order $\epsilon$ or involve only values of
the s.v.f.’s and l.v.f.’s in a $(h_0 + 2a)$ neighbourhood of $\tau = 0$.

It follows that the moments and cumulants of any statistic which is a product of powers
of s.l.l.f.’s have the same property, since these are expressible as linear functions of products
of powers; and hence that the moments and cumulants of any statistic which can be
expressed as a linear function of products of powers have the same property. The theorem
is thus seen to be very general in character, for most if not all statistics which measure local 
properties and are of interest in practice may be expressed either exactly or approximately 
in this form. The theorem implies, broadly speaking, that the variational properties of 
local statistics depend essentially on local variational properties of the parent series. 


3. ASYMPTOTIC MOMENTS OF LARGE-SAMPLE LOCAL STATISTICS 


Suppose that the statistics $L_{1i_1}, \dots, L_{si_s}$ (s even and divisible by an integer p) fall into r sets
of p, namely, $\mathcal{M}_1, \dots, \mathcal{M}_r$. Suppose, moreover, that the product of the statistics falling into
the set $\mathcal{M}_k$ is denoted by $M_k$. Thus

$$M_1 = L_{1i_1}L_{2i_2}\cdots L_{pi_p},\quad M_2 = L_{p+1,i_{p+1}}L_{p+2,i_{p+2}}\cdots L_{2p,i_{2p}},\quad \dots,\quad M_r = L_{s-p+1,i_{s-p+1}}\cdots L_{si_s}. \tag{27}$$
It is easily shown that

$$E(M_1 - E(M_1))(M_2 - E(M_2))\cdots(M_r - E(M_r)) = \sum_{\alpha\beta\dots\mu\nu}\operatorname{cov}(L_{\alpha i_\alpha}, L_{\beta i_\beta})\operatorname{cov}(L_{\gamma i_\gamma}, L_{\delta i_\delta})\cdots\operatorname{cov}(L_{\mu i_\mu}, L_{\nu i_\nu}), \tag{28}$$

where $\alpha\beta\dots\mu\nu$ is a permutation of the integers $1 \dots s$, subject to the conditions (16) and also
to the condition that no pair of suffixes associated with a covariance included in the formula
may come from the same set $\mathcal{M}$.

Two sets will be said to be far apart if the members of one set are far apart from the
members of the other. If any set is far apart from all the others, the product-moment (28)
will be $O(\epsilon^p)$, since every member of the set has to be associated in a covariance with some
member of another set. For some of the terms in (28) not to be $O(\epsilon^p)$, for each set there must
be at least one other set from which it is not far apart; if this is so, terms which are not $O(\epsilon^p)$
arise, but only when members from such neighbouring sets occur together in the covariance
brackets.

Many statistics which are of practical interest are means, or functions of means, of powers
or products of powers of s.l.l.f.’s of the stochastic variables over a sample region of the space
S or at a set of sample points, usually evenly spaced. Suppose that we are given sets
$\mathcal{M}_1, \mathcal{M}_2, \dots, \mathcal{M}_n$, where n is large (for example, of the same order of magnitude as the number
of sample points), and where the sets are evenly spread over the sample region in such a way
that every subregion of volume O(V/n) (i.e. O(1)) contains sets to the number O(1). Then if

$$U = n^{-1}(M_1 + M_2 + \dots + M_n), \tag{29}$$

we have

$$E(U - E(U))^r = n^{-r}\,E\sum (M_{\alpha_1} - E(M_{\alpha_1}))\cdots(M_{\alpha_r} - E(M_{\alpha_r})), \tag{30}$$

where $\alpha_1 \dots \alpha_r$ is a permutation of r of the integers $1 \dots n$.

If we assume $h_0 \ll n$, the number of terms in (30) which are not $O(\epsilon^p)$, i.e. which can be
formed from products of covariances of s.l.l.f.’s which are not far apart, is $O(n^{\frac{1}{2}r})$, since the
number of ways of choosing pairs of sets from $\mathcal{M}_1, \mathcal{M}_2, \dots, \mathcal{M}_n$ which are not far apart is of this
order. Hence

$$E(U - E(U))^r = O(n^{-\frac{1}{2}r}) + O(\epsilon^p), \tag{31}$$

the second term on the right-hand side of (31) being justified by the fact that the summation
in (30) has only $n^r$ terms altogether, and is divided by $n^r$.

If $\epsilon^p = o(n^{-\frac{1}{2}r})$, the moment will be dominated by those terms which involve only values
of the l.v.f. in a $(h_0 + 2a)$ neighbourhood of $\tau = 0$, and will be of magnitude $O(n^{-\frac{1}{2}r})$.

This argument may be generalized in a fairly obvious way to show that cross-moments
of order r about the mean for statistics having the same form as U also have this property
and this order of magnitude. Thus, since many statistics of useful potential application have
sampling errors which may be expressed, to a large-sample approximation at least, as linear
functions of products of sampling errors of statistics having this form, it may be shown that
the lower moments of these also depend essentially on values of the l.v.f. in a restricted
neighbourhood of $\tau = 0$.

4. EXAMPLES 


Example 1. Variance of a trend-reduced covariance 


Suppose we have a sample stretch $x_1 y_1, x_2 y_2, \dots, x_n y_n$ from the variation of two concomitant
series, x, y. The covariance of the difference between successive terms, which is the simplest
case of a trend-reduced covariance (cf. Jowett, 1955a), is defined by the equation

$$U = (n-1)^{-1}[(x_1 - x_2)(y_1 - y_2) + \dots + (x_{n-1} - x_n)(y_{n-1} - y_n)]. \tag{32}$$

Then

$$\operatorname{var} U = (n-1)^{-2}\sum_{i,j=1}^{n-1}\operatorname{cov}\big((x_i - x_{i+1})(y_i - y_{i+1}),\ (x_j - x_{j+1})(y_j - y_{j+1})\big)$$

$$= (n-1)^{-2}\sum_{i,j=1}^{n-1}\big\{\operatorname{cov}[(x_i - x_{i+1}), (x_j - x_{j+1})]\operatorname{cov}[(y_i - y_{i+1}), (y_j - y_{j+1})] + \operatorname{cov}[(x_i - x_{i+1}), (y_j - y_{j+1})]\operatorname{cov}[(y_i - y_{i+1}), (x_j - x_{j+1})]\big\}$$

$$= (n-1)^{-2}\sum_{i,j=1}^{n-1}\big\{\Delta''\delta_{xx}(i-j)\,\Delta''\delta_{yy}(i-j) + [\Delta''\delta_{xy}(i-j)]^2\big\}$$

$$= (n-1)^{-1}\sum_{s=-(n-2)}^{+(n-2)}\left(1-\frac{|s|}{n-1}\right)\big\{\Delta''\delta_{xx}(s)\,\Delta''\delta_{yy}(s) + (\Delta''\delta_{xy}(s))^2\big\}. \tag{33}$$

For independent series, $\delta_{xy}(s)$ is constant for all s, so that

$$\Delta''\delta_{xy}(s) = 0. \tag{34}$$

The dominant terms in the summation are those for small |s|; as |s| increases, and the $\delta$’s
straighten out, the remaining terms will rapidly tend to zero.


Example 2. The covariance of two serial variation statistics 
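The serial variation statistics of lags $s_1$ and $s_2$ ($s_2 < s_1$) calculated from a stretch of series
$x_1 \dots x_n$ are defined as follows:

$$d(s_1) = (n-s_1)^{-1}\sum_{i=1}^{n-s_1}\tfrac{1}{2}(x_i - x_{i+s_1})^2, \qquad d(s_2) = (n-s_2)^{-1}\sum_{j=1}^{n-s_2}\tfrac{1}{2}(x_j - x_{j+s_2})^2. \tag{35}$$

Then

$$\operatorname{cov}(d(s_1), d(s_2)) = (n-s_1)^{-1}(n-s_2)^{-1}\sum_{i=1}^{n-s_1}\sum_{j=1}^{n-s_2}\operatorname{cov}\big(\tfrac{1}{2}(x_i - x_{i+s_1})^2,\ \tfrac{1}{2}(x_j - x_{j+s_2})^2\big)$$

$$= (n-s_1)^{-1}(n-s_2)^{-1}\sum_{i,j} 2\big[\operatorname{cov}\big(\tfrac{1}{2}(x_i - x_{i+s_1}),\ (x_j - x_{j+s_2})\big)\big]^2$$

$$= (n-s_1)^{-1}(n-s_2)^{-1}\sum_{\alpha=-(n-s_1-1)}^{n-s_2-1}(n-c_\alpha)\cdot\tfrac{1}{2}[-\delta(\alpha) - \delta(\alpha+s_2-s_1) + \delta(\alpha-s_1) + \delta(\alpha+s_2)]^2$$

$$= n^{-1}\sum_{\alpha=-(n-s_1-1)}^{n-s_2-1}\frac{n(n-c_\alpha)}{(n-s_1)(n-s_2)}\cdot\tfrac{1}{2}[-\delta(\alpha) - \delta(\alpha+s_2-s_1) + \delta(\alpha-s_1) + \delta(\alpha+s_2)]^2, \tag{36}$$

where

$$c_\alpha = \begin{cases} s_2 + \alpha, & s_1 - s_2 < \alpha \le n - s_2 - 1,\\ s_1, & 0 \le \alpha \le s_1 - s_2,\\ s_1 + |\alpha|, & -(n-s_1-1) \le \alpha < 0. \end{cases} \tag{37}$$

The dominant terms in the summation in (36) are again those for small |s|. As |s| increases,
and $\delta(s)$ straightens out, the remaining terms, being proportional to squares of a kind of
second difference, again tend rapidly to zero.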
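The factor $(n - c_\alpha)$ in (36) is the number of pairs of terms whose starting points differ by $\alpha$, and this count can be confirmed by enumeration. The sketch below is illustrative only; it assumes the reading $s_2 < s_1$ with $\alpha = j - i$, where i indexes the terms of $d(s_1)$ and j those of $d(s_2)$, and the function names are hypothetical.

```python
def c_alpha(alpha, n, s1, s2):
    """Piecewise constant c_alpha, assuming s2 < s1 and alpha = j - i."""
    if not -(n - s1 - 1) <= alpha <= n - s2 - 1:
        raise ValueError("alpha outside the range of the summation")
    if alpha > s1 - s2:
        return s2 + alpha
    if alpha >= 0:
        return s1
    return s1 + abs(alpha)


def count_pairs(alpha, n, s1, s2):
    """Brute-force number of (i, j), 1 <= i <= n - s1, 1 <= j <= n - s2, with j - i = alpha."""
    return sum(1
               for i in range(1, n - s1 + 1)
               for j in range(1, n - s2 + 1)
               if j - i == alpha)


n, s1, s2 = 12, 5, 2
alphas = range(-(n - s1 - 1), n - s2)
agrees = all(count_pairs(a, n, s1, s2) == n - c_alpha(a, n, s1, s2) for a in alphas)
```

Under these assumptions `agrees` is true, and the counts sum to $(n - s_1)(n - s_2)$, the number of terms in the original double sum.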


Example 3. Spatial systematic sampling 


The systematic sampling will be of the variable x(u, v) which is defined over the rectangular
region $0 < u < r_1$, $0 < v < r_2$, samples being taken at the points $(i - \tfrac{1}{2}, j - \tfrac{1}{2})$ for integral i, j
such that $i = 1, \dots, r_1$, $j = 1, \dots, r_2$.

Let

$$L_{ij} = x(i-\tfrac{1}{2},\, j-\tfrac{1}{2}) - \int_{v=j-1}^{j}\int_{u=i-1}^{i} x(u,v)\,du\,dv. \tag{38}$$

Then

$$\operatorname{cov}(L_{ij}, L_{i'j'}) = -\delta(i-i',\, j-j') - \int_{i-i'-1}^{i-i'+1}\int_{j-j'-1}^{j-j'+1}(1-|\theta - i + i'|)(1-|\phi - j + j'|)\,\delta(\theta,\phi)\,d\phi\,d\theta$$

$$+ \int_{i'-1}^{i'}\int_{j'-1}^{j'}\delta(i-\tfrac{1}{2}-u,\ j-\tfrac{1}{2}-v)\,dv\,du + \int_{i-1}^{i}\int_{j-1}^{j}\delta(u-i'+\tfrac{1}{2},\ v-j'+\tfrac{1}{2})\,dv\,du. \tag{39}$$

This covariance is itself a s.l.l.f. of $\delta(\theta,\phi)$ over the region

$$i-i'-1 < \theta < i-i'+1, \qquad j-j'-1 < \phi < j-j'+1, \tag{40}$$

and will be negligible if the points (i, j), (i′, j′) are far enough apart.


The sampling variance of the mean U of a systematic sample of n ($= r_1 r_2$) about the mean
of the rectangular region is therefore given by

$$\operatorname{var} U = \operatorname{var}\left(n^{-1}\sum_{i=1}^{r_1}\sum_{j=1}^{r_2} L_{ij}\right) \tag{41}$$

$$= n^{-2}\sum_{i,j,i',j'}\operatorname{cov}(L_{ij}, L_{i'j'}) = n^{-2}\sum_{i,j,i',j'} c(i-i',\, j-j'), \text{ say.} \tag{42}$$

In evaluating the summation in (42), we take the terms in order of increasing distance
between (i, j), (i′, j′). Hence

$$\operatorname{var} U = \frac{1}{n}\left\{c(0,0) + \frac{2(n-r_2)}{n}\,c(1,0) + \frac{2(n-r_1)}{n}\,c(0,1) + \frac{4(r_1-1)(r_2-1)}{n}\,c(1,1) + \dots\right\}. \tag{43}$$

As the distance increases and $\delta(\theta,\phi)$ becomes locally linear, the terms will become
negligible.


In Ex. 1, 2 and 3 we are dealing with second moments, corresponding to r = 2 in (30).
Hence we verify that they are all of order $n^{-\frac{1}{2}r} = n^{-1}$, provided that local linearity is attained
rapidly enough. The next example is different, involving a greater value of r.


Example 4. Fourth moment of the statistic $U = n^{-1}\{(x_1 - x_2) + (x_3 - x_4) + \dots + (x_{2n-1} - x_{2n})\}$


The statistic U is identical with that defined in (1). In the expansion of $U^4$, the expectation
of a typical term is given by

$$E(x_{2i-1} - x_{2i})(x_{2i'-1} - x_{2i'})(x_{2i''-1} - x_{2i''})(x_{2i'''-1} - x_{2i'''})$$

$$= \Delta''\delta(2(i-i'))\,\Delta''\delta(2(i''-i''')) + \Delta''\delta(2(i-i''))\,\Delta''\delta(2(i'-i''')) + \Delta''\delta(2(i-i'''))\,\Delta''\delta(2(i'-i''))$$

$$= u_{ii'i''i'''}, \text{ say.} \tag{44}$$

The value of this depends only on the configuration of i, i′, i″, i‴. If these are placed in
increasing order of magnitude, we may denote the central interval by q, and the other two
by p and r, where $p \le r$. The number of terms giving the same values of p, q and r,
and hence the same expectations, is equal to $(n-p-q-r)$ multiplied by the factor $a_{pqr}$
given below:

    Range of p, q, r          a_pqr

    0 < p < r;  q > 0         2.4!
    0 < p = r;  q > 0         4!
    0 = p < r;  q > 0         2.4!/2!
    0 = p = r;  q > 0         4!/(2!)^2
    0 < p < r;  q = 0         2.4!/2!
    0 < p = r;  q = 0         4!/2!
    0 = p < r;  q = 0         2.4!/3!
    0 = p = q = r             1
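The multiplicities in the table can be confirmed by brute-force enumeration for a small n. The following sketch is offered as a check, not as the author's method: it computes $a_{pqr}$ as the number of distinct orderings of the four suffixes, doubled when $p < r$ to allow for the mirror-image configuration, and compares $(n-p-q-r)\,a_{pqr}$ with a direct count. The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import factorial, prod


def multiplicity(p, q, r):
    """a_pqr: distinct orderings of four suffixes with sorted gaps (p, q, r), p <= r."""
    if not 0 <= p <= r or q < 0:
        raise ValueError("require 0 <= p <= r and q >= 0")
    sizes = [1]                      # sizes of groups of coincident suffixes
    for gap in (p, q, r):
        if gap == 0:
            sizes[-1] += 1
        else:
            sizes.append(1)
    orderings = factorial(4) // prod(factorial(k) for k in sizes)
    return orderings * (2 if p < r else 1)   # factor 2: gaps (p, q, r) and (r, q, p)


def brute_force_counts(n):
    """Count quadruples (i, i', i'', i''') in 1..n by configuration (p, q, r)."""
    counts = Counter()
    for quad in product(range(1, n + 1), repeat=4):
        a, b, c, d = sorted(quad)
        outer = sorted((b - a, d - c))
        counts[(outer[0], c - b, outer[1])] += 1
    return counts


n = 6
counts = brute_force_counts(n)
table_holds = all(v == multiplicity(p, q, r) * (n - p - q - r)
                  for (p, q, r), v in counts.items())
```

For n = 6 every configuration count agrees with $(n-p-q-r)\,a_{pqr}$, and the counts add up to $n^4$, the number of terms in the expansion.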
If we write the total of these terms in each case as $(n-p-q-r)\,\beta_{pqr}$, we obtain the following
result:

$$E(U^4) = n^{-4}\sum_{i,i',i'',i'''=1}^{n} u_{ii'i''i'''}$$

$$= n^{-4}\sum_{\substack{p,q,r\\ (p\le r,\ p+q+r\le n-1)}}(n-p-q-r)\,\beta_{pqr}$$

$$= n^{-4}\sum_{r=0}^{n-1}\ \sum_{p=0}^{\min(r,\,n-r-1)}\ \sum_{q=0}^{n-p-r-1}(n-p-q-r)\,\beta_{pqr}. \tag{45}$$

If for $s > s_0$, $\delta(2s)$ differs from linearity over the interval (2s − 1, 2s + 1) by less than some
small quantity $\epsilon$, we have

$$|\Delta''\delta(2s)| < 4\epsilon. \tag{46}$$

Since

$$\beta_{pqr} \propto \Delta''\delta(2p)\,\Delta''\delta(2r) + \Delta''\delta(2(p+q))\,\Delta''\delta(2(q+r)) + \Delta''\delta(2(p+q+r))\,\Delta''\delta(2q), \tag{47}$$

$\beta_{pqr}$ will be of magnitude O(1) if both p and r are less than $s_0$, of magnitude $O(\epsilon)$ if just one
of p, q, r is less than $s_0$, and of magnitude $O(\epsilon^2)$ if none is less than $s_0$.
The summation in (45) may be separated into parts having different orders of magnitude.
If $s_0 \ll n$,

$$E(U^4) = n^{-4}\sum_{r=0}^{s_0-1}\ \sum_{p=0}^{r}\ \sum_{q=0}^{n-r-p-1}(n-p-q-r)\,\beta_{pqr}$$

$$+\ n^{-4}\sum_{r=s_0}^{n-1}\ \sum_{p=0}^{s_0-1}\ \sum_{q=0}^{n-r-p-1}(n-p-q-r)\,\beta_{pqr}$$

$$+\ n^{-4}\sum_{r=s_0}^{n-1}\ \sum_{p=s_0}^{\min(r,\,n-r-1)}\ \sum_{q=0}^{\min(s_0-1,\,n-r-p-1)}(n-p-q-r)\,\beta_{pqr}$$

$$+\ n^{-4}\sum_{r=s_0}^{n-1}\ \sum_{p=s_0}^{\min(r,\,n-r-1)}\ \sum_{q=\min(s_0-1,\,n-r-p-1)+1}^{n-r-p-1}(n-p-q-r)\,\beta_{pqr}. \tag{48}$$

The four terms on the right-hand side of (48) are of magnitudes respectively
$O(n^{-2}\epsilon^0)$, $O(n^{-1}\epsilon^1)$, $O(n^{-1}\epsilon^1)$ and $O(n^0\epsilon^2)$.
Hence if

$$\epsilon = o(n^{-1}) \tag{49}$$

the right-hand side of (48) will be dominated by the first term. To order $n^{-2}$, this may be
put in a more readily computable form:

$$E(U^4) \sim n^{-2}\sum_{r=0}^{s_0-1}\sum_{p=0}^{r}\tfrac{1}{2}a_{p1r}\,\Delta''\delta(2p)\,\Delta''\delta(2r), \tag{50}$$

since

$$\sum_{q=1}^{n-r-p-1}(n-p-q-r) = \tfrac{1}{2}(n-r-p)(n-r-p-1) \sim \tfrac{1}{2}n^2, \tag{51}$$

and $a_{pqr} = a_{p1r}$ for q > 0. To this order, we may neglect the term in $a_{p0r}$.

This example has been chosen because of its comparative simplicity, to illustrate the way 
in which the summations in these problems have to be arranged in order to take the
important terms first. It is a fourth moment, corresponding to r = 4 in (30), and we have
verified that it is of order $n^{-\frac{1}{2}r} = n^{-2}$, provided that linearity in $\delta(s)$ is attained rapidly
enough. It might be required in allowing for the effect of terms of order $n^{-1}$ in the test of
significance of the treatment effect in (5) (cf. Small, 1954), or in obtaining control chart
limits correct to order $n^{-1}$ for the difference between means of interpenetrating systematic
samples from, say, a conveyor belt.


REFERENCES 
Bartlett, M. S. (1946). Some aspects of the time-correlation problem in regard to tests of significance.
J.R. Statist. Soc. 98, 536.

Jowett, G. H. (1952). The accuracy of systematic sampling from conveyor belts. Appl. Statist. 1, 50.

Jowett, G. H. (1953). The comparison of means for industrial time series. Paper given to Royal
Statistical Society conference.

Jowett, G. H. (1955a). Least squares regression analysis for trend-reduced time series. J.R. Statist.
Soc. B (in the Press).

Jowett, G. H. (1955b). The comparison of means of industrial time series. Appl. Statist. 4 (in the
Press).

Small, V. J. (1954). M.Sc. Thesis for the University of Sheffield.










MODELS FOR TWO-DIMENSIONAL STATIONARY 
STOCHASTIC PROCESSES 


By V. HEINE 


Applied Mathematics Laboratory, Department of Scientific and Industrial Research, 
Wellington, N.Z.* 

To aid the analysis of two-dimensional stationary processes, three different models are considered, 

derived from the second-order stochastic partial differential equation. Their correlation functions are 

calculated, for fitting data in the form of a correlogram to the model. The corresponding Green 


functions describe the physical nature of the models, and in particular distinguish between space-like 
and time-like axes. 


1. INTRODUCTION 


The occurrence of stationary stochastic processes in two dimensions is well known. In 
analysing such data, it is at least useful, and occasionally of theoretical importance, to fit 
the correlogram to a particular plausible model and to estimate certain parameters in the 
model accordingly. However, owing to mathematical complexities, only the correlation 
functions generated by the very simplest models have been investigated in the past, 
particularly by Whittle (1954), and it would seem desirable to extend the range of available 
models. 

We shall consider the general second-order linear stochastic partial differential equation 

2 2 2 
(0+ ths tb + Be +S +e) Elan) = ety), (1-1) 
where $\xi$ and $\epsilon$ are the variate and the random impulses affecting it respectively, both with
means zero. This leads to three types of model, corresponding to parabolic, elliptic and 
hyperbolic forms, whose correlation and Green functions we shall derive. 

In two-dimensional processes, one is necessarily concerned with the difference between 
time-like and space-like axes. In a time series, the variate can only be influenced by past 
events; and accordingly, a time-like axis is one, such that a random impulse can only pro- 
duce an effect in one direction along it. Besides time itself, examples are the distance down 
a steep hillside, and the distance in the direction of a strong wind scattering seeds. Along 
a space-like axis, the variate depends on events in both directions. We discuss these features 
for each model in terms of the Green function, which represents the effect at one point, of 
a random impulse at another point. In fact, the whole nature of the process is best visualized 


by considering the Green function, which thus shows what kind of physical processes the 
models may represent. 


2. PRELIMINARY THEORY 


The results of §§ 2 and 3 are analogous to those relating to one-dimensional time series, and 


we present them here but briefly in the particular form required later (cf. Bartlett (1946) 
and Daniell (1946)). 


Consider the stochastic partial differential equation 


$$L\!\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right)\xi(x,y) = \epsilon(x,y). \tag{2-1}$$


* Now at Sidney Sussex College, Cambridge.


Assuming the validity of inverting the order of differentiation and integration, the solution
may be written formally

$$\xi(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(x-u,\, y-v)\,\epsilon(u,v)\,du\,dv, \tag{2-2}$$

where the Green function G(x, y) satisfies

$$L\!\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right)G(x,y) = \delta(x)\,\delta(y), \tag{2-3}$$

using the Dirac delta function (cf. Van der Pol & Bremner, pp. 75, 315). The physical
interpretation of (2-2) and (2-3) is that G(x − u, y − v) represents the effect at the point (x, y)
of a unit impulse at the point (u, v).


If the $\epsilon(u,v)$ are entirely uncorrelated random impulses, we have for the covariance
functions

$$\operatorname{cov}_\epsilon(x,y) = \text{expectation of } [\epsilon(u,v)\,\epsilon(u-x,\,v-y)] = \sigma^2\delta(x)\,\delta(y), \tag{2-4}$$

and

$$\operatorname{cov}_\xi(x,y) = \sigma^2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(u,v)\,G(u-x,\,v-y)\,du\,dv \tag{2-5}$$

by using (2-2) and (2-4). In what follows, it will be convenient to calculate the function

$$R(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(u,v)\,G(u-x,\,v-y)\,du\,dv, \tag{2-6}$$

which can then be normalized right at the end, to give the correlation function of $\xi$ in the form

$$\rho(x,y) = R(x,y)/R(0,0). \tag{2-7}$$

Further, G and R must tend to zero at infinity, and R must be finite everywhere.


3. USE OF THE LAPLACE TRANSFORM


The manipulative complexities, encountered in calculating the Green and correlation 
functions from (2-3), (2-6) and (2-7), are handled using the two-sided Laplace transform. 
We follow completely the notation and dictionary of formulae in Van der Pol & Bremner 
(1950), reference to whom will be made simply by the page numbers. Thus if f(p,q) is the 
transform of h(x, y) we write (pp. 18, 334-6) 


$$h(x,y) \risingdotseq f(p,q),$$

where

$$f(p,q) = pq\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-px-qy}\,h(x,y)\,dx\,dy.$$

If g(p, q) be written for the transform of G(x, y), (2-3) gives (pp. 48, 345, 384)

$$L(p,q)\,g(p,q) = pq,$$

i.e.

$$G(x,y) \risingdotseq g(p,q) = \frac{pq}{L(p,q)}. \tag{3-1}$$

The transform of (2-6) may be written down using (3-1) and the composition product
formula (pp. 39, 382); thus

$$R(x,y) \risingdotseq \frac{g(p,q)\,g(-p,-q)}{pq} = \frac{pq}{L(p,q)\,L(-p,-q)}. \tag{3-2}$$
4. STANDARD FORMS


By the simple substitutions x′ = −x, kx, x + y, x cos θ + y sin θ and in the hyperbolic case by
$x' = x + c_1 y$, $y' = x + c_2 y$, the general form (1-1) can always be reduced to one of the following
standard forms:


parabolic (= + a) _- at + ’) : (4:1) 
= 0 2 @g2 

elliptic (= a 2) + ay +y?; (4-2) 

hyperbolic (=<; + 2) (5, + A) cs y?; (4:3) 

degenerate y Ls +a o +2; (4-4) 
© Oy? Oy 


where $\alpha$, $\beta$, $\gamma$ are all real and positive, or zero.


5. THE PARABOLIC FORM 


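Consider

$$L\!\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right) = \left(\frac{\partial}{\partial x}+\alpha\right)^2 - \gamma^2\left(\frac{\partial}{\partial y}+\beta\right). \tag{5-1}$$

From (3-1)

$$G(x,y) \risingdotseq \frac{pq}{(p+\alpha)^2 - \gamma^2(q+\beta)}, \tag{5-2}$$

hence

$$e^{\alpha x+\beta y}\,G(x,y) \risingdotseq \frac{pq}{p^2-\gamma^2 q} \quad\text{(p. 374)}$$

$$\risingdotseq -\frac{p}{\gamma^2}\exp\!\left(\frac{p^2 y}{\gamma^2}\right)U(y) \quad\text{(p. 26)},$$

hence

$$G(x,y) = -\frac{1}{2\gamma\sqrt{(\pi y)}}\exp\!\left(-\alpha x - \beta y - \frac{\gamma^2x^2}{4y}\right)U(y) \quad\text{(p. 21)}, \tag{5-3}$$

where

$$U(y) = 1 \ (y \ge 0); \qquad U(y) = 0 \ (y<0). \tag{5-4}$$

The case of the lower sign in (4-1) is covered by considering $\beta$ as negative in (5-1), which
makes G in (5-3) non-vanishing at infinity, and is thus inadmissible.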

Equation (5-3) shows that an impulse at the origin only has an effect at positive values 
of y. Thus the y-axis is a time-like axis, whereas the x-axis is space-like. The Green function 
has the shape of a Gaussian error curve in the x-direction, with ever-increasing variance and 
decreasing amplitude as y increases. Hence the y-axis may well represent distance downhill, 
downstream or along the direction of a wind in any kind of diffusion phenomenon. Other- 
wise it may represent time, during which something spreads out along a line, for instance, 
the descendants of a plant. 


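From (3-2),

$$R(x,y) \risingdotseq \frac{pq}{[(p+\alpha)^2-\gamma^2(q+\beta)]\,[(p-\alpha)^2+\gamma^2(q-\beta)]}$$

$$\risingdotseq \frac{p}{2\gamma^2(\gamma^2\beta-\alpha^2-p^2)}\exp\!\left\{\left[\frac{(p\pm\alpha)^2}{\gamma^2}-\beta\right]|y|\right\} \quad\text{(p. 27)}, \tag{5-5}$$

the upper sign being taken for y > 0 and the lower for y < 0.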
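Without loss of generality, consider the two quadrants with y > 0. Thus rearranging,

$$R(x,y) \risingdotseq \frac{e^{-\beta y}}{2\gamma^2}\left[p\exp\!\left(\frac{(p+\alpha)^2y}{\gamma^2}\right)\right]\left[\frac{p}{(\gamma^2\beta-\alpha^2)-p^2}\right]\frac{1}{p}. \tag{5-6}$$

Now

$$p\exp\!\left[\frac{(p+\alpha)^2y}{\gamma^2}\right] \risingdotseq \frac{\gamma}{2\sqrt{(\pi y)}}\exp\!\left(-\alpha x-\frac{\gamma^2x^2}{4y}\right) \quad\text{(p. 21)}, \tag{5-7}$$

and

$$\frac{p}{(\gamma^2\beta-\alpha^2)-p^2} \risingdotseq \frac{1}{2\sqrt{(\gamma^2\beta-\alpha^2)}}\exp\!\left(-\sqrt{(\gamma^2\beta-\alpha^2)}\,|x|\right) \quad\text{(p. 27)}. \tag{5-8}$$

We can now use the composition product rule (p. 39), also (5-7) and (5-8), in (5-6):

$$R(x,y) = (\text{constant})\int_{-\infty}^{\infty}\exp\!\left(-\alpha\tau-\frac{\gamma^2\tau^2}{4y}-\sqrt{(\gamma^2\beta-\alpha^2)}\,|x-\tau|\right)d\tau. \tag{5-9}$$

The integral in (5-9) may be simplified by breaking it up into two integrals over the ranges
$\tau < x$ and $\tau > x$ respectively, and then using linear substitutions

$$t = \frac{\gamma}{2\sqrt{y}}\left[\tau + \frac{2y}{\gamma^2}\left(\alpha \mp \sqrt{(\gamma^2\beta-\alpha^2)}\right)\right].$$

Using (2-7) to insert the appropriate normalizing constant, we finally have

$$\rho(x,y) = \rho(-x,-y), \quad\text{where } y>0,$$

$$= \frac{e^{-2AB}}{\sqrt{\pi}}\int_{-\infty}^{A-B}e^{-t^2}\,dt + \frac{e^{+2AB}}{\sqrt{\pi}}\int_{A+B}^{\infty}e^{-t^2}\,dt, \tag{5-10}$$

where

$$A = \frac{\gamma}{2\sqrt{y}}\left(x+\frac{2\alpha y}{\gamma^2}\right), \qquad B = \left\{y\left(\beta-\frac{\alpha^2}{\gamma^2}\right)\right\}^{\frac{1}{2}},$$

with $0 \le \alpha^2 < \beta\gamma^2$ [see (5-13)].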

As emphasized by Van der Pol & Bremner (1950, pp. 19, 27), to ensure the validity of 
the above argument, it is necessary to show that (5-2)-(5-9) possess a common non-vanishing 
region of convergence. This requirement eliminates several other solutions of (5-2) and (5-5) 
which would otherwise appear to be formally possible, and also imposes some sufficient 
conditions on $\alpha$, $\beta$ and $\gamma$. By considering (5-1), (2-3) and (2-6) directly, without the use of
the Laplace transform, these conditions may be shown to be also necessary. 

Thus (5-3) requires (pp. 26, 374)

$$\operatorname{Re}\left[-\beta+\frac{(p+\alpha)^2}{\gamma^2}\right] < \operatorname{Re} q < \infty. \tag{5-11}$$

(5-5) in addition requires (p. 378)

$$-\infty < \operatorname{Re} q < -\operatorname{Re}\left[-\beta+\frac{(p-\alpha)^2}{\gamma^2}\right]. \tag{5-12}$$

For these regions to overlap, we must have the left-hand expression in (5-11) less than the
right-hand expression in (5-12), which leads to

$$0 \le \alpha^2 < \beta\gamma^2. \tag{5-13}$$

It may be shown that if and only if (5-13) holds, do (5-2) to (5-9) have a non-vanishing
common region of convergence.
6. ELLIPTIC FORMS
The special form

$$L\!\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right) = \frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}-\gamma^2 \tag{6-1}$$

has been treated by Whittle (1954), who used it in discussing the yields of orange trees in
a square array. For the sake of completeness, we repeat

$$G(x,y) = -\frac{1}{2\pi}K_0(\gamma r) \risingdotseq \frac{pq}{p^2+q^2-\gamma^2} \quad\text{(pp. 357, 407)}. \tag{6-2}$$

R(x, y) is obtained by differentiating (6-2) with respect to $\gamma$ (p. 373). Then from (2-7),

$$\rho(x,y) = (\gamma r)\,K_1(\gamma r). \tag{6-3}$$

Here and elsewhere, $r = \sqrt{(x^2+y^2)}$, and the K’s are the modified Bessel functions of the
second kind. Both the x- and y-axes are space-like, and for (6-1) the Green and correlation
functions decrease monotonically equally in all directions. A simple change of scale in the
x- or y-direction introduces unequal degrees of correlation along these axes. This model
would be applicable to a wide variety of circumstances, such as field trials on flat land.
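For fitting a correlogram, the correlation function $\rho = \gamma r K_1(\gamma r)$ needs only the modified Bessel function $K_1$. The sketch below evaluates $K_\nu$ from the standard integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$ by the trapezoidal rule; this numerical method, the truncation point and the function names are assumptions made for illustration, and a library routine would normally be preferred.

```python
import math


def bessel_k(nu, x, steps=4000):
    """K_nu(x) for x > 0 from the integral of exp(-x cosh t) cosh(nu t), trapezoidal rule."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Truncate where x*cosh(t) is large enough for the integrand to be negligible.
    t_max = math.acosh(max(60.0 / x, 2.0))
    h = t_max / steps
    total = 0.5 * (math.exp(-x)
                   + math.exp(-x * math.cosh(t_max)) * math.cosh(nu * t_max))
    for k in range(1, steps):
        t = k * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h


def rho_elliptic(x, y, gamma):
    """Correlation function (gamma r) K_1(gamma r), with the limiting value 1 at r = 0."""
    r = math.hypot(x, y)
    if r == 0.0:
        return 1.0
    z = gamma * r
    return z * bessel_k(1, z)
```

The function depends on x and y only through r, so the fitted correlation is the same in all directions, as stated in the text; a change of scale in x or y before calling `rho_elliptic` gives unequal correlation along the axes.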

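A form involving a further parameter is

$$L\!\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right) = \left(\frac{\partial}{\partial x}-\alpha\right)^2+\frac{\partial^2}{\partial y^2}-\gamma^2. \tag{6-4}$$

From (3-1), (6-2) and pp. 357, 339,

$$G(x,y) = -\frac{e^{\alpha x}}{2\pi}K_0(\gamma r) \risingdotseq \frac{pq}{(p-\alpha)^2+q^2-\gamma^2}. \tag{6-5}$$

This Green function is the same as (6-2), except that the term exp ($\alpha x$) increases the influence
an impulse has in the positive x-direction at the expense of the negative x-direction, and
thus may represent the effect of a wind or a slight slope of the ground. From (3-2),

$$R(x,y) \risingdotseq \frac{pq}{[(p-\alpha)^2+q^2-\gamma^2]\,[(p+\alpha)^2+q^2-\gamma^2]} = \frac{1}{4\alpha p}\left[\frac{pq}{(p-\alpha)^2+q^2-\gamma^2}-\frac{pq}{(p+\alpha)^2+q^2-\gamma^2}\right].$$

Using (6-5), the integration rule (p. 51) and (2-7)

$$\rho(x,y) = \frac{\sqrt{(\gamma^2-\alpha^2)}}{\sin^{-1}(\alpha/\gamma)}\int_{x}^{\infty}\sinh(\alpha\tau)\,K_0\big(\gamma\sqrt{(\tau^2+y^2)}\big)\,d\tau, \tag{6-6}$$

where the normalizing constant follows from a formula in Watson (1944) on p. 388.
Convergence of (6-6) requires $\alpha < \gamma$.
All other elliptic forms are inadmissible. Consider

$$L\!\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right) = \left(\frac{\partial}{\partial x}-\alpha\right)^2+\frac{\partial^2}{\partial y^2}+\gamma^2. \tag{6-7}$$

If $\alpha = \gamma = 0$, the Green function is log r which does not vanish at infinity. If $\alpha = 0$, $\gamma \ne 0$,
the Green function is $\tfrac{1}{4}Y_0(\gamma r)$ (p. 357), and the integral in (2-6) does not converge. If $\alpha \ne 0$,
$\gamma \ne 0$, the Green function $\tfrac{1}{4}\exp(\alpha x)\,Y_0(\gamma r)$ does not vanish at infinity.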
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7. HYPERBOLIC FORMS 


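Consider
$$L\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right) = \left(\frac{\partial}{\partial x}+\alpha\right)\left(\frac{\partial}{\partial y}+\beta\right)+\gamma^2. \tag{7-1}$$
From (3-1) and p. 346, and using the function U defined in (5-4),
$$G(x, y) = e^{-\alpha x-\beta y}\,J_0\{2\gamma\sqrt{(xy)}\}\,U(x)\,U(y) \risingdotseq \frac{pq}{(p+\alpha)(q+\beta)+\gamma^2} \tag{7-2}$$
is the only Green function that vanishes at infinity. Its form shows that any impulse can only
have an influence in the positive x- and y-directions, which therefore represent two time-like
axes.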








[Fig. 1. Path of integration for (7-7). The figure shows the x- and y-axes, the points A, B, P(X, Y) and Q, and the angle $\theta$ with $\tan\theta = \alpha/\beta$.]
From (3-2),
$$R(x, y) \risingdotseq \frac{pq}{[(p+\alpha)(q+\beta)+\gamma^2]\,[(p-\alpha)(q-\beta)+\gamma^2]}, \tag{7-3}$$
with region of convergence $|\mathrm{Re}\,p| < \alpha$, $|\mathrm{Re}\,q| < \beta$. If we put





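$$f(x, y) = e^{-\alpha x-\beta y}J_0\{2\gamma\sqrt{(xy)}\}U(x)U(y) - e^{\alpha x+\beta y}J_0\{2\gamma\sqrt{(xy)}\}U(-x)U(-y)
\risingdotseq \frac{pq}{(p+\alpha)(q+\beta)+\gamma^2} - \frac{pq}{(p-\alpha)(q-\beta)+\gamma^2}, \tag{7-4}$$
then (7-3) becomes
$$2\left(\beta\frac{\partial}{\partial x}+\alpha\frac{\partial}{\partial y}\right)R(x, y) = -f(x, y). \tag{7-5}$$
Hence R(X, Y) is obtained by integrating f(x, y) of (7-4) in the $\theta$ direction, i.e. along PQ
(Fig. 1), making an angle $\tan^{-1}(\alpha/\beta)$ with the x-axis: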


$$R(X, Y) = \frac{1}{2\sqrt{(\alpha^2+\beta^2)}}\int_{P(X,Y)}^{Q} f(x, y)\,ds. \tag{7-6}$$
In the first quadrant, f(x,y) is equal to the first term in (7-4), and


(X,¥) = ee [1+ Z) fo ee ereeley Meena (77) 
vay v(a? + A?) ap} J px, ¥) , 

where the normalizing constant is obtained from a formula on p. 384 in Watson (1944).
In the fourth quadrant, f(x,y) is zero, so that $\rho$ is constant along any line AB (Fig. 1), and
is equal to $\rho(B)$ as given by (7-7). In the third and fourth quadrants, $\rho$ is obtained from the
relation $\rho(X, Y) = \rho(-X, -Y)$. It is necessary for the convergence of (7-7) that $\alpha > 0$
and $\beta > 0$. In evaluating (7-7) numerically, it would be helpful to employ a change of scale
to make $\alpha$ and $\beta$ each $1/\sqrt{2}$.


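The form
$$L\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right) = \left(\frac{\partial}{\partial x}+\alpha\right)\left(\frac{\partial}{\partial y}+\beta\right) \tag{7-8}$$
may be considered as a special case of (7-1) with $\gamma = 0$, but it is easier to start again from first
principles. From (3-1) and p. 386,
$$G(x, y) = e^{-\alpha x-\beta y}\,U(x)\,U(y) \risingdotseq \frac{pq}{(p+\alpha)(q+\beta)}. \tag{7-9}$$
From (3-2), (2-7) and p. 386,
$$\rho(x, y) = \exp(-\alpha|x|-\beta|y|) \tag{7-10}$$
over all four quadrants.
Considering the form
$$L\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right) = \left(\frac{\partial}{\partial x}+\alpha\right)\left(\frac{\partial}{\partial y}+\beta\right)-\gamma^2, \tag{7-11}$$
the above analysis (7-1) to (7-7) applies, except that $J_0$ must now be replaced by $I_0$, and the
sign of $\gamma^2$ changed. In this case it is necessary that $\gamma^2 < \alpha\beta$ for convergence.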


8. DEGENERATE CASES 


Degenerate forms
$$L\left(\frac{\partial}{\partial x},\frac{\partial}{\partial y}\right) = \gamma\frac{\partial^2}{\partial y^2}+\alpha\frac{\partial}{\partial y}+\beta \tag{8-1}$$
are strictly speaking inadmissible. For from (3-2),
$$R(x, y) = \delta(x)\,R_y(y), \tag{8-2}$$
where $R_y(y)$ is some function of y; and thus R(x, y) cannot be normalized in the sense of
(2-7), because of the delta function.

However, a case may well arise in practice, where the correlogram is approximately zero 
everywhere except along one direction, say the y-axis. It is therefore necessary to consider 
forms arbitrarily close to (8-1); or what is the same thing, to consider (8-1) as the result of 
some limiting process. 

The important fact is that the result depends on the particular limiting process used. 
How this is possible may be seen as follows. Consider the rather restricted form 


Cn?) bn acid 
Uns.) with limit (8-1) as 7>0. (8-3) 


The correlation function may always be written in the form 


$$\rho(x, y) = \rho(0, y)\,N(x/\eta, y), \tag{8-4}$$


where $N(x/\eta, y) = 1$ for $x = 0$, and is small for $x \gg \eta$, since $\rho \to 0$ as $x/\eta \to \infty$.


In the limit, the form (8-1) has the correlation function 


$$\rho(x, y) = \rho_y(y)\,N(x), \tag{8-5}$$
where the null function
$$N(x) = 1 \quad (x = 0); \qquad N(x) = 0 \quad (x \neq 0).$$


As there are many different forms satisfying (8-3), so there are different correlation functions 
(8-4) and (8-5). We give the following example. 


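Limiting form: $\dfrac{\partial}{\partial y}+\beta$.


I form: $\left(\eta\dfrac{\partial}{\partial x}+1\right)\left(\dfrac{\partial}{\partial y}+\beta\right)$.


Limiting correlation function, from (7-10):


$$\rho(x, y) = e^{-\beta|y|}\,N(x).$$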





II form: (- ae +(5 +8) 
: T oa oy ? 
Limiting correlation function, from (5-1): 
2N (zx) [ ? a 
x,y) = edt 
p(x,y) vm J vsiy) 


Corresponding results for other more general forms of the type (8-1) may be easily 
obtained. 


9. SUMMARY OF ALLOWED MODELS 


The allowed standard forms (cf. § 4) are as follows: 
Parabolic; x-axis space-like, y-axis time-like: 


0<ar< py; 


Green function (5-3), correlation function (5-10): 
Elliptic; x- and y-axes space-like:


O0<a<y; 


Green function (6-5), correlation function (6-6): 
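Hyperbolic; x- and y-axes time-like:


$$\left(\frac{\partial}{\partial x}+\alpha\right)\left(\frac{\partial}{\partial y}+\beta\right)+\gamma^2, \qquad \alpha > 0,\ \beta > 0,\ \gamma^2 \ge 0;$$


Green function (7-2), correlation function (7-7): also


$$\left(\frac{\partial}{\partial x}+\alpha\right)\left(\frac{\partial}{\partial y}+\beta\right)-\gamma^2, \qquad \alpha > 0,\ \beta > 0,\ 0 < \gamma^2 < \alpha\beta;$$


for Green and correlation functions, see above (7-11).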


The problem was suggested by Dr P. Whittle, to whom I am also indebted for helpful 
discussion. 
SOME PROBLEMS IN THE THEORY OF PROVISIONING
AND OF DAMS 


By J. GANI 
Australian National University, Canberra, A.C.T. 


The paper begins with a review of two problems in the theory of provisioning with a discrete stock 
considered by Pitt (1946), and a short account of Moran’s work (1954) in the theory of finite dams. 
It is pointed out that these provide two different methods of attack, each appropriate to certain 
conditions, on problems in the probability theory of a general storage function S(t) defined at time t by
$$S(t) = I(t) - D(t) - F(t),$$

where I(t), D(t), F(t) are respectively an input, output and overflow function. This storage function is
identified with the stock deficit in provisioning theory, or the dam content in dam theory, so that any
problem and its solution in the one theory has an exact analogue in the other.

The paper continues with the use of Pitt’s results of the theory of provisioning in two analogous 
cases of the infinite discrete dam. This is followed by the application of Moran’s methods of the theory 
of dams in some analogous problems of provisioning with a discrete finite stock; exact solutions are 
obtained for the discrete and continuous cases of a particular problem in which ordering and replace- 
ment times coincide. 

The paper closes with an exact solution of the general storage problem in the case of a finite
continuous storage function S(t), fed by a discrete input function of Poisson type, with a continuous
output which has a steady rate when $S(t) \neq 0$, and is zero if S(t) = 0, and an overflow function such
that S(t) never exceeds a prescribed value.


I. PROBLEMS IN THE THEORY OF PROVISIONING 


In his work on the theory of provisioning, Pitt (1946) is concerned with the derivation of
stationary probability distributions for a discrete stock function s(t). This stock function
represents the number of components in stock at a time t, and is allowed to take integral
values ranging between K and $-\infty$, where K = s(0) is the initial stock, and negative values
indicate that components may be borrowed if they are not in stock. The function is defined
for all values of time t (t continuous) by the equation
$$s(t) = K + r(t) - c(t), \tag{1}$$
where c(t) is the consumption function, the number of components consumed up to time t,
a random increasing step function taking positive integral values, and r(t) is the replacement
function, the number of components delivered and added to the stock up to time t, a function
depending on c(t) and always smaller than or equal to it.
The probability distribution associated with the consumption function c(t) is the Poisson
with parameter a, so that the probabilities that in a small interval of time $\delta t$ there be one
and no components consumed respectively are
$$\Pr\{c(t+\delta t)-c(t) = 1\} = a\,\delta t, \qquad \Pr\{c(t+\delta t)-c(t) = 0\} = 1-a\,\delta t, \tag{2}$$
and increases in c(t) in non-overlapping intervals are independent. Two types of replacement
functions are considered:

Type 1. $$r_1(t) = [c(t-T)\,M^{-1}]\,M, \tag{3}$$


where we define [x] as the integral part of x; here, orders for a constant number of components
M are sent out at irregular time intervals $t-T$, when $c(t-T)$ is an integral multiple
of M, and are delivered and added to the stock at time t, after a positive time-lag T;

Type 2. $$r_2(t) = c\{[(t-T)\,aM^{-1}]\,Ma^{-1}\}; \tag{4}$$

here, orders for an irregular number of components equal to the consumption in the previous
time interval $Ma^{-1}$ are sent out at regular times $kMa^{-1}$ (k = 1, 2, ...), and are delivered and
added to stock at times $kMa^{-1}+T$ after a positive time-lag T.

For both these types of replacement function, the stationary probability distributions 
of s(t) are derived by arguments which will be followed exactly in §IV of the paper. Using 
these probability distributions, Pitt concludes with a comparison of the means of both 
the positive and negative values of the stock function s(t) when the replacement functions 
are of types 1 and 2. 
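
The two replacement functions (3) and (4) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper names (`poisson_arrivals`, `c`, `r1`, `r2`) and the parameter values are hypothetical and do not come from Pitt's paper, and the consumption path is simulated from exponential gaps with rate a.

```python
import bisect
import math
import random


def poisson_arrivals(a, horizon, rng):
    """Times of unit consumptions of a Poisson process of rate a on (0, horizon]."""
    times, t = [], rng.expovariate(a)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(a)
    return times


def c(times, t):
    """Consumption function c(t): number of components consumed up to time t."""
    return bisect.bisect_right(times, t)   # 0 for t before the first consumption


def r1(times, t, T, M):
    """Type 1 replacement (3): [c(t - T) / M] * M, i.e. whole batches of M."""
    return (c(times, t - T) // M) * M


def r2(times, t, T, M, a):
    """Type 2 replacement (4): consumption up to the last ordering time k*M/a <= t - T."""
    k = math.floor((t - T) * a / M)
    return c(times, k * M / a)


# Hypothetical parameters, chosen only for illustration.
rng = random.Random(0)
a, T, M, K = 2.0, 1.5, 3, 10
times = poisson_arrivals(a, 40.0, rng)
t = 25.0
print("stock under type 1:", K + r1(times, t, T, M) - c(times, t))
print("stock under type 2:", K + r2(times, t, T, M, a) - c(times, t))
```

Because both replacement functions never exceed c(t), the stock s(t) = K + r(t) - c(t) computed this way never rises above K.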


II. PROBLEMS IN THE THEORY OF DAMS 


The problem discussed by Moran (1954), that of obtaining the stationary probability distribution
of the amount of water in a dam, appears at first sight to be of a different nature.
A finite dam of capacity K units, whose content at the times t = 0, 1, 2, ..., after water has
been released according to a prescribed rule, is Z(t), where Z(t) will be found to lie in the
range (0, K - M), is subject to the following conditions:

(1) a discrete input X(t) which flows into the dam during the interval of time (t, t+1),
where the series {X(t)} is serially independent, and the probability that the input be i units
(i = 0, 1, 2, ...) is $p_i$;

(2) an overflow rule such that in any time interval (t, t+1) there is no overflow from the
dam if $Z(t)+X(t) \le K$, and there is an overflow $Z(t)+X(t)-K$ if $Z(t)+X(t) > K$;

(3) a release rule such that at time t+1, M units are released from the dam if its total
content $Z(t)+X(t) \ge M$, and the total content Z(t) + X(t) is released if this is less than M.

Then, provided $K-M > M$, a system of equations is derived relating the probabilities
$P_0, P_1, \ldots, P_{K-M}$ that Z(t) be equal to 0, 1, ..., K - M at time t, with the probabilities
$P_0^{(1)}, P_1^{(1)}, \ldots, P_{K-M}^{(1)}$ that Z(t+1) be equal to 0, 1, ..., K - M at time t+1. These are:

$$\begin{aligned}
P_0^{(1)} &= P_0(p_0+p_1+\ldots+p_M)+P_1(p_0+\ldots+p_{M-1})+\ldots+P_M\,p_0,\\
P_1^{(1)} &= P_0\,p_{M+1}+P_1\,p_M+\ldots+P_{M+1}\,p_0,\\
&\;\;\vdots\\
P_{K-M-1}^{(1)} &= P_0\,p_{K-1}+P_1\,p_{K-2}+\ldots+P_{K-M}\,p_{M-1},\\
P_{K-M}^{(1)} &= P_0\,q_K+P_1\,q_{K-1}+\ldots+P_{K-M}\,q_M,
\end{aligned} \tag{5}$$

where, for convenience, we have written $q_j = \sum_{i=j}^{\infty} p_i$. Written in matrix form, these equations
are $P^{(1)} = pP$, where $P$, $P^{(1)}$ are column vectors with elements $P_i$, $P_i^{(1)}$ respectively, and p is
the matrix of coefficients.
The stationary distribution $\{P_i\}$ is required, and this is obtained by writing $P^{(1)} = P$ in the
set of equations (5) and solving $P = pP$ together with the additional condition $\sum_{i=0}^{K-M} P_i = 1$.


It is pointed out that although no solution in an explicit form may exist in general, the
values of the stationary probabilities $P_i$ can always be evaluated numerically for any known
matrix of coefficients p, and these are in fact computed for a special case. An extension of
this discrete theory also allows equations for the continuous stationary distribution of
U(t) = Z(t) + X(t) to be written when Z(t), X(t) are continuous and U(t) lies in the range
$(0, \infty)$.
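
The numerical evaluation mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not Moran's own computation: the input distribution, the values K = 6 and M = 2, and the function names are assumptions made here, and the stationary vector is found by simply iterating P <- pP rather than by solving the linear equations directly.

```python
def moran_matrix(p, K, M):
    """Matrix of coefficients for the discrete dam, as a list of rows.

    p[x] is the probability of an input of x units (finite support assumed);
    entry [j][i] is Pr{Z(t+1) = j | Z(t) = i} for i, j = 0, ..., K - M.
    """
    assert abs(sum(p) - 1.0) < 1e-12 and 0 < M <= K
    n = K - M + 1
    rows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for x, px in enumerate(p):
            total = min(i + x, K)        # overflow rule (2)
            j = total - min(M, total)    # release rule (3)
            rows[j][i] += px
    return rows


def stationary(rows, tol=1e-14, max_iter=100_000):
    """Iterate P <- pP from the uniform vector until it stops changing."""
    n = len(rows)
    P = [1.0 / n] * n
    for _ in range(max_iter):
        Q = [sum(rows[j][i] * P[i] for i in range(n)) for j in range(n)]
        if max(abs(u - w) for u, w in zip(P, Q)) < tol:
            return Q
        P = Q
    return P


# Hypothetical example: K = 6, M = 2, inputs of 0, 1, ..., 4 units.
p_in = [0.3, 0.3, 0.2, 0.1, 0.1]
P = stationary(moran_matrix(p_in, K=6, M=2))
print([round(v, 6) for v in P])
```

Plain iteration is adequate here because the chain can stay in state 0 and reach every other state; for a larger K a direct linear solve would be the more usual choice.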
III. PROVISIONING AND DAM PROBLEMS AS GENERAL STORAGE PROBLEMS


We now proceed to identify these problems of the stock function s(t) of provisioning, and
of the dam content Z(t) in dam theory as particular problems of a general storage function
S(t). This may be discrete or continuous, and defined for continuous time t or at fixed times
such as t = 0, 1, 2, ..., according to the conditions of the problem, by the equation


$$S(t) = I(t) - D(t) - F(t), \tag{6}$$


where I(t), D(t), F(t) are positive increasing functions, discrete or continuous. I(t) is a
random input function which feeds the storage function, D(t) is an output function which
depletes the storage function and may depend on I(t) or be otherwise defined, and F(t) is
an overflow function which operates to restrict the range of S(t) below a prescribed maximum
value.

To show that the problems of Pitt’s stock function s(t) of provisioning can be regarded as
problems of a particular storage function S(t), we consider the previously defined stock
function (1),
$$s(t) = K + r(t) - c(t),$$
which lies in the range $-\infty < s(t) \le K$. We note that this equation holds at any time t
(t continuous), and that c(t), the random discrete consumption function, and r(t), the
discrete replacement function ($r(t) \le c(t)$), are defined as in §I. We can here equally well
consider the stock deficit,
$$K - s(t) = c(t) - r(t), \tag{7}$$
which ranges from zero when the stock is at its maximum K, to $\infty$ as the stock decreases
through 0 to $-\infty$. Naturally this is possible only when the stock has a maximum value K
which it cannot exceed. Comparing this with a storage function
$$S(t) = I(t) - D(t), \tag{8}$$
for which F(t) = 0 for all t, it is clear that a discrete storage function S(t) defined in the range
$(0, \infty)$ for all time t (t continuous), is identifiable with the deficit K - s(t). Similarly, I(t),
a discrete random input function, is identifiable with the discrete random consumption
function c(t), and D(t), a discrete output function ($D(t) \le I(t)$), with a discrete replacement
function r(t). F(t) will be zero at all times, since the range of the stock deficit is $(0, \infty)$, and
no overflow is required to restrict this range below any prescribed value.

In the specific problems considered by Pitt, where the consumption function c(t) was of
Poisson type with parameter a, and the replacement functions of the types previously
defined in (3), (4), as $r_1(t)$ and $r_2(t)$, the general storage equation would be such that


$$S(t) = K - s(t), \quad I(t) = c(t), \quad D(t) = r_1(t) \text{ or } r_2(t), \quad F(t) = 0.$$


However, generally, for any stock function s(t), whether continuous or discrete, defined for
continuous time t or at discrete intervals of time, providing only that the maximum stock
K is finite, it is always possible to frame the general storage equation (6) for the stock deficit.
The finite maximum stock K results in the storage function S(t) having a range with lower
bound 0; its upper bound may be infinite, as in the case just considered, or finite as we shall
see in §V. I(t), D(t) and F(t) will be appropriately chosen to fit the conditions of the problem.

Similarly, we proceed to show that Moran’s discrete dam content Z(t) with range (0, K - M)
can be regarded as a particular storage function. In this case, however, the function Z(t)
is defined only at the times t = 0, 1, 2, ..., and we shall also define the storage function at
these times. In our storage equation (6), we may identify S(t) with Z(t) - Z(0), the increase
in dam content after a time t; if, without loss in generality, we assume that Z(0) = 0, that is,
the dam is empty at time t = 0, we have that S(t) the storage function is equal to the dam
content Z(t). The random discrete input function I(t) would be the sum of the discrete
random inputs X(0), X(1), ..., X(t-1) in the time intervals (0, 1), (1, 2), ..., (t-1, t), so that
$$I(t) = \sum_{\tau=1}^{t} X(\tau-1).$$
The discrete output function D(t) would be the sum of discrete outputs released at times
$\tau = 1, 2, \ldots, t$; let these be written d(1), d(2), ..., d(t), then we have that
$$d(\tau) = M \quad\text{if } Z(\tau-1)+X(\tau-1) \ge M,$$
$$\text{or}\quad d(\tau) = Z(\tau-1)+X(\tau-1) \quad\text{if this is less than } M,$$
so that the output function is
$$D(t) = \sum_{\tau=1}^{t} d(\tau).$$
Finally, the overflow function F(t) would be the sum of the discrete overflows in the time
intervals (0, 1), ..., (t-1, t); let these be written f(0), ..., f(t-1), then we have that
$$f(\tau-1) = 0 \quad\text{if } Z(\tau-1)+X(\tau-1) \le K,$$
$$\text{or}\quad f(\tau-1) = Z(\tau-1)+X(\tau-1)-K \quad\text{if this is greater than zero},$$
so that the overflow function is
$$F(t) = \sum_{\tau=1}^{t} f(\tau-1).$$
The equation for Z(t) at times t = 0, 1, 2, ... could be written
$$Z(t) = \sum_{\tau=1}^{t} X(\tau-1) - \sum_{\tau=1}^{t} d(\tau) - \sum_{\tau=1}^{t} f(\tau-1), \tag{9}$$
a particular case of the storage function S(t). For a continuous dam content, a similar
equation could also be written. More generally, for a Z(t) continuous or discrete, defined
at any time t (t continuous) or at discrete times, with a finite or infinite range, it is possible
to frame a general storage equation.
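
The bookkeeping behind equation (9) can be checked with a small Python sketch. It is an illustration only: the function name `run_dam` and the input sequence are hypothetical, and the dam is taken to be empty at t = 0 as assumed in the text.

```python
def run_dam(inputs, K, M):
    """Track Z(t) with the cumulative I(t), D(t), F(t) for a dam empty at t = 0.

    inputs[tau - 1] plays the role of X(tau - 1); the release and overflow
    in each step follow the rules for d(tau) and f(tau - 1) given above.
    """
    z, I, D, F = 0, 0, 0, 0
    history = []
    for x in inputs:
        total = z + x
        f = max(total - K, 0)             # overflow f(tau - 1)
        d = M if total >= M else total    # release d(tau)
        z = total - d - f
        I, D, F = I + x, D + d, F + f
        history.append((z, I, D, F))
    return history


# Hypothetical input sequence with K = 6, M = 2.
for z, I, D, F in run_dam([3, 0, 5, 1, 0, 0, 7], K=6, M=2):
    assert z == I - D - F and 0 <= z <= 6 - 2   # equation (9) and the range (0, K - M)
    print(z, I, D, F)
```

At every step the content equals the cumulative input less the cumulative release and overflow, which is exactly the storage equation (6) read at integer times.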

In neither the case of problems of provisioning nor of dam problems will the storage 
equation itself necessarily provide a method of solution enabling the stationary probability 
distribution of the function to be found. But as we are always able to reduce provisioning 
or dam problems to a general storage problem, this means that a provisioning problem can 
be interpreted equally well as a dam problem, and vice versa. In other words, any problem 
and its solution in the one theory will have an exact analogue in the other; equally, any 
method of attack on a problem in the one theory may be useful in analogous problems of 
the other. 

We proceed to interpret Pitt’s results in provisioning theory as solutions in two analogous 
cases of the infinite dam. We shall also apply Moran’s methods in the theory of dams to 
some problems in the theory of provisioning. 
IV. INTERPRETATION OF PITT’S RESULTS IN THE CASE OF THE INFINITE DAM
Pitt’s equation for the stock deficit
$$K - s(t) = c(t) - r(t) \qquad (r(t) \le c(t)),$$
was seen to be an example of the storage equation
$$S(t) = I(t) - D(t),$$
where at any time t (t continuous), S(t) is discrete and lies in the range $(0, \infty)$, I(t) is a random
discrete input function, and D(t) is a discrete output function depending on I(t), ($D(t) \le I(t)$).
Equally, a storage equation of this form can represent the discrete dam content Z(t)
(S(t) = Z(t)), at any time t, where the dam is infinite ($0 \le Z(t) < \infty$). In this case, interpreting
the specific conditions of Pitt’s problems of provisioning for an infinite dam, we have that
the discrete input function I(t) will be of Poisson type with parameter a such that, in a small
interval of time $\delta t$, the probabilities that one and no units of water flow into the dam are


$$\Pr\{I(t+\delta t)-I(t) = 1\} = a\,\delta t, \qquad \Pr\{I(t+\delta t)-I(t) = 0\} = 1-a\,\delta t, \tag{10}$$


and increases in I(t) in non-overlapping intervals are independent. The release rules corresponding
to Pitt’s two replacement functions are such that the resulting output functions
will be:

Type 1. $$D_1(t) = [I(t-T)\,M^{-1}]\,M; \tag{11}$$


here, releases of M units are made at irregular time intervals t, after a time-lag T from the
times $t-T$ when $I(t-T)$ is an integral multiple of M. In practice, this might mean that
at every time when the input increased by M units, a decision would be taken to release
M units from the dam, but this decision would be carried out after a time-lag T;


Type 2. $$D_2(t) = I\{[(t-T)\,aM^{-1}]\,Ma^{-1}\}; \tag{12}$$


here, releases of an irregular number of units equal to the input in the previous time interval
$((k-1)Ma^{-1}, kMa^{-1})$ are made at regular times $kMa^{-1}+T$ (k = 1, 2, ...), with time-lag T
after the interval; in practice, this might mean that at the regular times $kMa^{-1}$, a decision
would be taken to release the input in the previous interval $Ma^{-1}$, but this decision would
be carried out after a time-lag T.

We proceed to follow Pitt exactly in deriving the stationary distributions of the dam
content Z(t) when the release rules of types 1 and 2 are functioning, and in comparing the
mean contents of the dam for these cases.


(1) The stationary distribution of Z(t) for release rule of type 1 


In this case, where the output function is given by (11), the storage equation for the dam
content can be written


$$Z(t) = I(t) - [I(t-T)\,M^{-1}]\,M \qquad (0 \le Z(t) < \infty).$$
Let the input function at times t and $t-T$ be such that


$$I(t-T) = i \quad\text{and}\quad I(t)-I(t-T) = j,$$
where i and j can take the values 0, 1, 2, .... Then, writing


$$P_n(t) = \Pr\{Z(t) = n\} \qquad (n = 0, 1, 2, \ldots),$$
for the probability that the dam content be n units at time t, we have that
$$P_n(t) = \Pr\{i+j-[iM^{-1}]\,M = n\}.$$

Now suppose that
$$kM \le i \le (k+1)M-1 \qquad (k = 0, 1, \ldots),$$
then it follows that
$$[iM^{-1}] = k, \quad i = kM+v \quad\text{and}\quad j = n-v,$$
where v can take the values 0, 1, ..., M - 1. From the independence condition for increases
in I(t) in non-overlapping intervals, it follows that
$$P_n(t) = \sum_{v=0}^{M-1}\sum_{k=0}^{\infty}\Pr\{i = kM+v\}\,\Pr\{j = n-v\}
= \sum_{v=0}^{M-1} g_{n-v}\sum_{k=0}^{\infty} e^{-a(t-T)}\frac{\{a(t-T)\}^{kM+v}}{(kM+v)!}, \tag{13}$$
where we write $g_r = e^{-aT}(aT)^r/r!$ for $r \ge 0$, and $g_r = 0$ for $r < 0$.
The stationary distribution is obtained when $t \to \infty$; we shall write it $P_n = \lim_{t\to\infty} P_n(t)$. It
follows that
$$P_n = \sum_{v=0}^{M-1} g_{n-v}\,\lim_{t\to\infty}\sum_{k=0}^{\infty} e^{-a(t-T)}\frac{\{a(t-T)\}^{kM+v}}{(kM+v)!}. \tag{14}$$

Now consider the series $\sum_{k=0}^{\infty} e^{-y}\,y^{kM+v}/(kM+v)!$, where $y = a(t-T)$; if $y = v+(N+1)M$, we have


that since v << M—1, eens for k = 0, 














of y"\ ly”  shenlell | 
M- \ityt.. +5 As v<M li + @4+M—1if (15) 
For k = 1,2,..., N, the following inequalities hold: 
mle yeh —1)M+1 yf tke | 
\(v+(k—-1) M41)!’ * @+kM)! 
v+kM aad y?tkM yoHk+1) M1 Me 
<oreMy orem t ter esnM_py 
but for k = N +1, only one side of the inequality is valid 
if yorN aed yrHNeDM ye , 
a; \(o+NM +41)! 7° =t™G +(N+1)M)!) <@4+(V4) MM)! (17) 


Further, for all values of k greater than N +1 (k = N+2,...), the inequalities are reversed 


Vu J yt tkM+1 yo Hk+l) M | 

”~ \o+kt +i" * e+ (e+ DM! 
ye tkM el pee Gee etal 1 a 

<+kMy << lot a-pmit te+em—py 8) 








Summing the inequalities (15), (16), (17) and (18), for all values of k, and multiplying the 
result by e~”, we obtain 
yHNHDM+i =) @ v+kM 


{ M y 
M1- Se- < > ev... < M- iter 


, yeHN+DM | 
Now let $N \to \infty$; y will also tend to infinity, and we have that


$$M^{-1} \le \lim_{y\to\infty}\sum_{k=0}^{\infty} e^{-y}\frac{y^{v+kM}}{(v+kM)!} \le M^{-1},$$


so that the equation (14) for $P_n$ can be written


$$P_n = M^{-1}\sum_{v=0}^{M-1} g_{n-v}. \tag{19}$$
The mean content of the dam is given by


$$\mathcal{E}(Z) = M^{-1}\sum_{n=0}^{\infty} n\sum_{v=0}^{M-1} g_{n-v}
= M^{-1}\sum_{n=0}^{\infty}\{n+(n+1)+\ldots+(n+M-1)\}\,g_n
= aT+\tfrac12(M-1). \tag{20}$$


(2) The stationary distribution of Z(t) for release rule of type 2 
In this case, the output function is (12), and releases are made at regular times $kMa^{-1}+T$
(k = 0, 1, ...). Consider the time interval
$$kMa^{-1}+T \le t < (k+1)Ma^{-1}+T$$

between two releases, where T may have any value ($T \gtrless Ma^{-1}$); at any time in this interval,
the content Z(t) of the dam has the value

$$Z(t) = I(t) - I([(t-T)\,aM^{-1}]\,Ma^{-1}) = I(t) - I(kMa^{-1}).$$

Now if we write for the probability that the dam content be n units at time t,

$$P_n(t) = \Pr\{Z(t) = n\} \qquad (n = 0, 1, \ldots),$$


then we have that
$$P_n(t) = \Pr\{I(t)-I(kMa^{-1}) = n\} = e^{-a(t-kMa^{-1})}\frac{\{a(t-kMa^{-1})\}^n}{n!}.$$


This is periodic in $Ma^{-1}$, so that in order to obtain the stationary distribution $\{P_n\}$, we write
$$P_n = aM^{-1}\int_{t=kMa^{-1}+T}^{(k+1)Ma^{-1}+T} P_n(t)\,dt = M^{-1}\int_{u=0}^{M} e^{-(u+aT)}\frac{(u+aT)^n}{n!}\,du,$$


where $u = a(t-kMa^{-1}-T)$.


On expanding the binomial, we have


$$P_n = \sum_{v=0}^{n} e^{-aT}\frac{(aT)^{n-v}}{(n-v)!}\,M^{-1}\int_0^M e^{-u}\frac{u^v}{v!}\,du = \sum_{v=0}^{n} g_{n-v}\,\alpha_v,$$


where $\alpha_v = M^{-1}\int_0^M e^{-u}\dfrac{u^v}{v!}\,du$.
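The mean of the dam content in this case is given by


$$\mathcal{E}(Z) = \sum_{n=0}^{\infty} n\sum_{v=0}^{n} g_{n-v}\,\alpha_v
= \sum_{v=0}^{\infty}\alpha_v\sum_{n=0}^{\infty}(n+v)\,g_n
= \sum_{v=0}^{\infty}\alpha_v\,(aT+v).$$


It is easy to see that


$$\sum_{v=0}^{\infty}\alpha_v = M^{-1}\int_0^M\sum_{v=0}^{\infty} e^{-u}\frac{u^v}{v!}\,du = 1; \qquad
\sum_{v=0}^{\infty} v\,\alpha_v = M^{-1}\int_0^M\sum_{v=0}^{\infty} v\,e^{-u}\frac{u^v}{v!}\,du = \tfrac12 M,$$
whence the mean is obtained as
$$\mathcal{E}(Z) = aT+\tfrac12 M. \tag{23}$$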


Comparing (23) with (20), we note that the second release rule gives a larger mean content 
of the dam than the first. 
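
The two means can be checked numerically from the stationary distributions themselves. The Python sketch below is an illustration under assumptions made here: the values a = 4, T = 1/2, M = 4 and the truncation point are arbitrary choices, the function names are hypothetical, and the integral in $\alpha_v$ is evaluated through the standard identity $\int_0^M e^{-u}u^v/v!\,du = 1-\sum_{j=0}^{v} e^{-M}M^j/j!$.

```python
import math


def poisson_pmf(mean, r):
    """Poisson probability of r events; zero for negative r (the convention for g_r)."""
    return math.exp(-mean) * mean ** r / math.factorial(r) if r >= 0 else 0.0


def type1_distribution(a, T, M, n_max):
    """P_n = (1/M) * sum_{v=0}^{M-1} g_{n-v}, as in (19)."""
    return [sum(poisson_pmf(a * T, n - v) for v in range(M)) / M
            for n in range(n_max + 1)]


def alpha(M, v):
    """alpha_v = (1/M) * Pr{Poisson(M) > v}, equal to (1/M) * int_0^M e^-u u^v / v! du."""
    return (1.0 - sum(poisson_pmf(M, j) for j in range(v + 1))) / M


def type2_distribution(a, T, M, n_max):
    """P_n = sum_{v=0}^{n} g_{n-v} * alpha_v for the release rule of type 2."""
    return [sum(poisson_pmf(a * T, n - v) * alpha(M, v) for v in range(n + 1))
            for n in range(n_max + 1)]


def mean(dist):
    return sum(n * p for n, p in enumerate(dist))


# Hypothetical values; the sums are truncated at n = 60, where the tails are negligible.
a, T, M = 4.0, 0.5, 4
print(mean(type1_distribution(a, T, M, 60)), a * T + (M - 1) / 2)
print(mean(type2_distribution(a, T, M, 60)), a * T + M / 2)
```

Each printed pair should agree to within truncation and rounding error, with the second rule exceeding the first by one half.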


V. APPLICATION OF MORAN’S METHODS TO PROVISIONING WITH A FINITE POSITIVE STOCK 


In dealing with provisioning, our notation will be more suggestive of the conditions of the 
problems if we write the general storage equation in the form 


$$S(t) = K - s(t) = c(t) - r(t) - F(t), \tag{24}$$


where S(t) is the stock deficit, s(t) the stock with maximum value K, and c(t), r(t) and F(t),
consumption, replacement and overflow functions respectively.

Pitt considered cases where, since components not in stock could be borrowed, the deficit
could become infinite, so that for replacement functions always smaller than or equal to
consumption functions ($r(t) \le c(t)$), no overflow function was necessary, and F(t) was equal
to zero. We shall discuss the case, in practice equally realistic, where the stock is finite and
positive ($0 \le s(t) \le K$), so that the deficit is also finite and positive ($0 \le S(t) \le K$). This is the
case where no borrowing is allowed, and consumption is lost between times when the stock
becomes zero (or the deficit K) until there is a replacement which raises the stock above zero
(or decreases the deficit below K). This will require, in cases where r(t) is smaller than c(t),
that an overflow function operates so as to restrict the deficit to a value no greater than K,
$S(t) \le K$. In these conditions it is difficult to discuss S(t) for all time t (t continuous), though
it remains comparatively simple to consider it at specific time intervals. This was the case
for the dam content Z(t), defined at times t = 0, 1, 2, ..., for which Moran framed a set of
equations which gave its stationary probability distribution; we proceed to apply Moran’s
methods to three particular problems of provisioning with finite positive stock.


(1) Provisioning with replacements at fixed times t = 1,2, ..., and no time-lag 
This problem was designed to provide an example of an exact analogue in provisioning
theory of the dam problem considered by Moran; in practice, however, it gives rise to a
perfectly possible situation. Suppose that at fixed times t = 1, 2, ..., a consignment of M
components arrives at a store for a stock replacement; if at these times there is a deficit
equal to or greater than M, the replacement consists of all M components, but if the deficit
is smaller than M, only enough components are added to the stock from the consignment
to reduce the deficit to zero. There is no time-lag between the evaluation of the deficit at
times $t-0$, and the deliveries of components for replacements at times t. This might well be
the case for a truck containing a consignment of M components which, in its regular
deliveries at a store, discharges for replacements only as many of the components as may
be required. This replacement policy can be expressed by the replacement function r(t),
a step function with jumps at times t = 1, 2, ..., such that


$$r(t)-r(t-1) = M \quad\text{if } S(t-0) \ge M,$$
$$\text{or}\quad r(t)-r(t-1) = S(t-0) \quad\text{if } S(t-0) < M.$$
Let the discrete consumption function c(t) for the store be defined in any interval of time
$\theta$ by the distribution $\{p_i(\theta)\}$ (i = 0, 1, 2, ...), such that
(1) $$\Pr\{c(t+\theta)-c(t) = i\} = p_i(\theta), \tag{25}$$
(2) increases in c(t) in non-overlapping time intervals are independent.
The consumption will be met by components from stock only so long as the stock is positive
(or the deficit less than K), but when the stock is zero (or the deficit K), no further orders
can be met until a replacement arrives. This condition defines the overflow function F(t),
for in the interval of time (t-1, t) between replacements, the overflow required to restrict
the deficit to a value K will be such that


$$F(t)-F(t-1) = 0 \quad\text{if } S(t-1)+c(t)-c(t-1) \le K,$$
$$\text{or}\quad F(t)-F(t-1) = S(t-1)+c(t)-c(t-1)-K \quad\text{if this is greater than zero}.$$
The storage equation (24) can now be written for times t = 0, 1, 2, ..., since the consumption,
replacement and overflow functions are all defined; this, however, presents no great advantage
as it does not lead to a method of solving the problem of finding the stationary probability
distribution for the deficit S(t).

If for t = 0, 1, 2, ... we write the probabilities that the deficit be i components at times t
and t+1 respectively as


$$P_i = \Pr\{S(t) = i\}, \qquad P_i^{(1)} = \Pr\{S(t+1) = i\},$$
where i takes the values 0, 1, ..., K - M, and if we write $p_i = p_i(1)$ and $q_j = \sum_{i=j}^{\infty} p_i$, then for
$K-M > M$ the equations relating the probabilities are identical to (5), those obtained by
Moran in his dam problem. In matrix form, we write $P^{(1)} = pP$, where $P$, $P^{(1)}$ are column
vectors with elements $P_i$, $P_i^{(1)}$ respectively, and p is the matrix of coefficients. Exactly as
in Moran’s problem, the stationary distribution $P_0, P_1, \ldots, P_{K-M}$ is obtained by solving the
equations $P = pP$ together with $\sum_{i=0}^{K-M} P_i = 1$; these $\{P_i\}$ cannot always be obtained in an
explicit form, but they can always be evaluated numerically for any given values of the $\{p_i\}$.


(2) Provisioning with replacements at fixed times
$t = kMa^{-1}+T$ (k = 1, 2, ...), with time-lag $T < Ma^{-1}$


(a) General equations for the stationary distributions 


This problem stems from Pitt’s second problem of provisioning under replacement policy
of type 2, with replacement function (4). The differences arising are due to the restriction of
the stock s(t) to the range $0 \le s(t) \le K$, or the deficit S(t) to the range $0 \le S(t) \le K$, and to the
consideration of the deficit at specific time intervals only. The consumption function c(t)
is defined by (25) as in the previous problem, and the replacement function (4) is


$$r_2(t) = c([(t-T)\,aM^{-1}]\,Ma^{-1}),$$


where an irregular number of components equal to the consumption in the previous time
interval $Ma^{-1}$ are ordered at regular times $kMa^{-1}$ (k = 1, 2, ...), and are delivered at times
$kMa^{-1}+T$, after a positive time-lag T, where $T < Ma^{-1}$. As the deficit is restricted to values
no greater than K, consumption is lost in the intervals after the deficit reaches the value K
until a replacement arrives to reduce the deficit to a value less than K. This defines the
overflow function in such a way that, in the intervals $kMa^{-1}$, $kMa^{-1}+T$, between orders
and deliveries, the overflow is


$$F(kMa^{-1}+T)-F(kMa^{-1}) = 0 \quad\text{if } S(kMa^{-1})+c(kMa^{-1}+T)-c(kMa^{-1}) \le K,$$
$$\text{or}\quad F(kMa^{-1}+T)-F(kMa^{-1}) = S(kMa^{-1})+c(kMa^{-1}+T)-c(kMa^{-1})-K \quad\text{if this is greater than zero}. \tag{26}$$


Similarly, in the intervals $kMa^{-1}+T$, $(k+1)Ma^{-1}$, between deliveries and new orders, the
overflow is


$$F((k+1)Ma^{-1})-F(kMa^{-1}+T) = 0 \quad\text{if } S(kMa^{-1}+T)+c((k+1)Ma^{-1})-c(kMa^{-1}+T) \le K,$$
$$\text{or}\quad F((k+1)Ma^{-1})-F(kMa^{-1}+T) = S(kMa^{-1}+T)+c((k+1)Ma^{-1})-c(kMa^{-1}+T)-K \quad\text{if this is greater than zero}.$$


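The storage equation (24) is now fully defined at times $kMa^{-1}$, $kMa^{-1}+T$, but, as in the
previous problem, this does not lead to a solution of the stationary distribution of S(t),
the deficit. We can, however, obtain sets of equations relating the probabilities


$$P_0, P_1, \ldots, P_K; \qquad P_0^{(1)}, P_1^{(1)}, \ldots, P_K^{(1)}; \qquad\text{and}\qquad P_0^{(2)}, P_1^{(2)}, \ldots, P_K^{(2)},$$
that the deficit S(t) be equal to 0, 1, 2, ..., K, at the times $kMa^{-1}$, $kMa^{-1}+T$, $(k+1)Ma^{-1}$
respectively. Writing $p_i = p_i(T)$, and $q_j = \sum_{i=j}^{\infty} p_i$, we have for the times $kMa^{-1}$, $kMa^{-1}+T$,
the relations
$$\begin{aligned}
P_0^{(1)} &= p_0(P_0+P_1+\ldots+P_{K-1})+q_0P_K,\\
P_1^{(1)} &= p_1(P_0+P_1+\ldots+P_{K-2})+q_1P_{K-1},\\
&\;\;\vdots\\
P_K^{(1)} &= q_KP_0;
\end{aligned} \tag{27}$$
or in matrix form, $P^{(1)} = pP$:
$$\begin{pmatrix} P_0^{(1)}\\ P_1^{(1)}\\ P_2^{(1)}\\ \vdots\\ P_K^{(1)} \end{pmatrix}
= \begin{pmatrix}
p_0 & p_0 & \cdots & p_0 & p_0 & q_0\\
p_1 & p_1 & \cdots & p_1 & q_1 & 0\\
p_2 & p_2 & \cdots & q_2 & 0 & 0\\
\vdots & & & & & \vdots\\
q_K & 0 & \cdots & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} P_0\\ P_1\\ P_2\\ \vdots\\ P_K \end{pmatrix}. \tag{28}$$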
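Similarly, writing $p_i^{(1)} = p_i(Ma^{-1}-T)$ and $q_j^{(1)} = \sum_{i=j}^{\infty} p_i^{(1)}$, we have for the times $kMa^{-1}+T$,
$(k+1)Ma^{-1}$, the relations
$$\begin{aligned}
P_0^{(2)} &= p_0^{(1)}P_0^{(1)},\\
P_1^{(2)} &= p_1^{(1)}P_0^{(1)}+p_0^{(1)}P_1^{(1)},\\
&\;\;\vdots\\
P_K^{(2)} &= q_K^{(1)}P_0^{(1)}+q_{K-1}^{(1)}P_1^{(1)}+\ldots+q_0^{(1)}P_K^{(1)},
\end{aligned} \tag{29}$$
or in matrix form, $P^{(2)} = p^{(1)}P^{(1)}$:
$$\begin{pmatrix} P_0^{(2)}\\ P_1^{(2)}\\ P_2^{(2)}\\ \vdots\\ P_K^{(2)} \end{pmatrix}
= \begin{pmatrix}
p_0^{(1)} & 0 & 0 & \cdots & 0\\
p_1^{(1)} & p_0^{(1)} & 0 & \cdots & 0\\
p_2^{(1)} & p_1^{(1)} & p_0^{(1)} & \cdots & 0\\
\vdots & & & & \vdots\\
q_K^{(1)} & q_{K-1}^{(1)} & q_{K-2}^{(1)} & \cdots & q_0^{(1)}
\end{pmatrix}
\begin{pmatrix} P_0^{(1)}\\ P_1^{(1)}\\ P_2^{(1)}\\ \vdots\\ P_K^{(1)} \end{pmatrix}. \tag{30}$$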


The relations between $P^{(2)}$ and $P$, the probabilities of S(t) at consecutive ordering times
$kMa^{-1}$, $(k+1)Ma^{-1}$, are then given by


$$P^{(2)} = p^{(1)}pP, \tag{31}$$


where the matrix $p^{(1)}p$ is square. For ordering times, the stationary probability distribution
$P_0, P_1, \ldots, P_K$ can be found by putting $P_i^{(2)} = P_i$ for i = 0, 1, ..., K, and solving the set of
equations $P = p^{(1)}pP$ together with $\sum_{i=0}^{K} P_i = 1$. This solution may not be obtained easily in
an explicit form, but it can always be computed numerically for any given distribution
$\{p_i(\theta)\}$, i = 0, 1, .... We note, in passing, that this stationary distribution for ordering
times will be easier to compute if we first obtain the stationary probability distribution
$P_0^{(1)}, P_1^{(1)}, \ldots, P_K^{(1)}$ for delivery times. Writing $P_i^{(3)} = \Pr\{S((k+1)Ma^{-1}+T) = i\}$ for the
probability that S(t) have a value i at delivery time $(k+1)Ma^{-1}+T$, we have that the
relation between $P^{(3)}$ and $P^{(1)}$, the probabilities of S(t) at consecutive delivery times
$kMa^{-1}+T$, $(k+1)Ma^{-1}+T$, is given by


$$P^{(3)} = pp^{(1)}P^{(1)}, \tag{32}$$
where the matrix $pp^{(1)}$ is triangular. The stationary probability distribution for delivery
times, $P_0^{(1)}, P_1^{(1)}, \ldots, P_K^{(1)}$, can be found by putting $P_i^{(3)} = P_i^{(1)}$ for i = 0, 1, ..., K, and solving the
equations $P^{(1)} = pp^{(1)}P^{(1)}$, together with $\sum_{i=0}^{K} P_i^{(1)} = 1$. For numerical computation of $P^{(1)}$,
the work is considerably lightened by the fact that $pp^{(1)}$ is triangular; once the stationary
probability distribution for delivery times is obtained, the simple relation $P = p^{(1)}P^{(1)}$ will
enable the stationary distribution for ordering times, $P_0, P_1, \ldots, P_K$ to be found.


(b) A numerical example


As an illustration of the computations necessary to evaluate the stationary distributions
$\{P_i\}$ and $\{P_i^{(1)}\}$, for ordering and delivery times, we construct an example in which the
consumption function $c(t)$ has, associated with it, a Poisson distribution such that

\[ p_i(t) = e^{-at}\frac{(at)^i}{i!} \qquad (i = 0, 1, \ldots). \]

We choose the maximum deficit $K = 9$, the time-lag between order and delivery $T = \tfrac{1}{2}$,
the time interval between orders $Ma^{-1} = 1$, and the mean consumption per unit time $a = 4$.
Then, since $T' = Ma^{-1}-T = \tfrac{1}{2}$, we have that the $p_i$ and $p_i^{(1)}$ defined previously are equal,

\[ p_i = p_i^{(1)} = e^{-2}\,2^i/i! \qquad (i = 0, 1, \ldots). \]

From Molina's tables for Poisson's Exponential Binomial Limit, we obtain

\[
\begin{array}{ll}
p_0 = p_0^{(1)} = 0.135335, & q_0 = q_0^{(1)} = 1,\\
p_1 = p_1^{(1)} = 0.270671, & q_1 = q_1^{(1)} = 0.864665,\\
p_2 = p_2^{(1)} = 0.270671, & q_2 = q_2^{(1)} = 0.593994,\\
p_3 = p_3^{(1)} = 0.180447, & q_3 = q_3^{(1)} = 0.323324,\\
p_4 = p_4^{(1)} = 0.090224, & q_4 = q_4^{(1)} = 0.142877,\\
p_5 = p_5^{(1)} = 0.036089, & q_5 = q_5^{(1)} = 0.052653,\\
p_6 = p_6^{(1)} = 0.012030, & q_6 = q_6^{(1)} = 0.016564,\\
p_7 = p_7^{(1)} = 0.003437, & q_7 = q_7^{(1)} = 0.004534,\\
p_8 = p_8^{(1)} = 0.000859, & q_8 = q_8^{(1)} = 0.001097,\\
 & q_9 = q_9^{(1)} = 0.000237.
\end{array}
\tag{33}
\]





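The equations giving the stationary distribution $\{P_i^{(1)}\}$ for delivery times are $P^{(1)} = pp^{(1)}P^{(1)}$,
together with $\sum_{i=0}^{9} P_i^{(1)} = 1$; these, in the case of the $\{p_i\}$ and $\{q_i\}$ given in (33), result in eleven
equations, not all independent, for ten unknowns:

\[
\begin{aligned}
&-0.864460P_0^{(1)} + 0.136284P_1^{(1)} + 0.139256P_2^{(1)} + 0.149657P_3^{(1)} + 0.180862P_4^{(1)}\\
&\qquad + 0.258876P_5^{(1)} + 0.414902P_6^{(1)} + 0.645941P_7^{(1)} + 0.882981P_8^{(1)} + P_9^{(1)} = 0,\\
&0.271117P_0^{(1)} - 0.727584P_1^{(1)} + 0.276590P_2^{(1)} + 0.287625P_3^{(1)} + 0.310012P_4^{(1)}\\
&\qquad + 0.339183P_5^{(1)} + 0.343934P_6^{(1)} + 0.270671P_7^{(1)} + 0.117019P_8^{(1)} = 0,\\
&0.271486P_0^{(1)} + 0.273334P_1^{(1)} - 0.722144P_2^{(1)} + 0.285591P_3^{(1)} + 0.290341P_4^{(1)}\\
&\qquad + 0.270671P_5^{(1)} + 0.197408P_6^{(1)} + 0.080388P_7^{(1)} = 0,\\
&0.181348P_0^{(1)} + 0.182615P_1^{(1)} + 0.183837P_2^{(1)} - 0.819553P_3^{(1)} + 0.160777P_4^{(1)}\\
&\qquad + 0.111934P_5^{(1)} + 0.043756P_6^{(1)} = 0,\\
&0.090630P_0^{(1)} + 0.090224P_1^{(1)} + 0.086834P_2^{(1)} + 0.075304P_3^{(1)} - 0.949117P_4^{(1)}\\
&\qquad + 0.019336P_5^{(1)} = 0,\\
&0.035683P_0^{(1)} + 0.033922P_1^{(1)} + 0.028904P_2^{(1)} + 0.019136P_3^{(1)} + 0.007126P_4^{(1)} - P_5^{(1)} = 0,\\
&0.011129P_0^{(1)} + 0.009368P_1^{(1)} + 0.006111P_2^{(1)} + 0.002242P_3^{(1)} - P_6^{(1)} = 0,\\
&0.002623P_0^{(1)} + 0.001692P_1^{(1)} + 0.000614P_2^{(1)} - P_7^{(1)} = 0,\\
&0.000413P_0^{(1)} + 0.000148P_1^{(1)} - P_8^{(1)} = 0,\\
&0.000032P_0^{(1)} - P_9^{(1)} = 0,\\
&P_0^{(1)} + P_1^{(1)} + P_2^{(1)} + P_3^{(1)} + P_4^{(1)} + P_5^{(1)} + P_6^{(1)} + P_7^{(1)} + P_8^{(1)} + P_9^{(1)} = 1.
\end{aligned}
\]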
Solving these equations by straightforward elimination, which is greatly simplified by the
fact that $pp^{(1)}$ is triangular, we obtain the following values for $\{P_i^{(1)}\}$:

\[
\begin{array}{ll}
P_0^{(1)} = 0.148685, & P_5^{(1)} = 0.026833,\\
P_1^{(1)} = 0.281325, & P_6^{(1)} = 0.006382,\\
P_2^{(1)} = 0.277080, & P_7^{(1)} = 0.001036,\\
P_3^{(1)} = 0.177621, & P_8^{(1)} = 0.000103,\\
P_4^{(1)} = 0.080930, & P_9^{(1)} = 0.000005.
\end{array}
\tag{34}
\]

From these we can compute the stationary distribution $\{P_i\}$ for ordering times, using the
relations between $P$ and $P^{(1)}$ which are given in matrix form by $P = p^{(1)}P^{(1)}$. This is easier
than solving directly the equations $P = p^{(1)}pP$ and $\sum_{i=0}^{K} P_i = 1$, where the matrix $p^{(1)}p$ is
square. We obtain for $\{P_i\}$ the values:

\[
\begin{array}{ll}
P_0 = 0.020122, & P_5 = 0.154360,\\
P_1 = 0.078318, & P_6 = 0.099024,\\
P_2 = 0.153890, & P_7 = 0.053655,\\
P_3 = 0.202012, & P_8 = 0.025004,\\
P_4 = 0.198206, & P_9 = 0.015410.
\end{array}
\]


(c) Comparison with Pitt's results for an infinite deficit 


It is interesting to compare the stationary distribution of the deficit $S(t)$ in the present
case, where it is finite and restricted to the range $0 \le S(t) \le K$, with that obtained by Pitt
for an infinite deficit with the same types of consumption and replacement functions. Pitt's
problem for the infinite deficit, when the consumption function is of Poisson type, and the
replacement function is the $r_2(t)$ defined in (4), is the exact analogue of the problem of the
infinite dam considered in § IV (2). With a slight change in the notation of (22) to avoid
confusion, we write the stationary distribution of the deficit $S(t)$ when its range is $0 \le S(t) < \infty$
as $\{\Pi_i\}$ $(i = 0, 1, \ldots)$, where

\[ \Pi_i = \sum_{v=0}^{i} p_{i-v}Q_v. \tag{35} \]

Here, the $p_{i-v} = e^{-aT}\dfrac{(aT)^{i-v}}{(i-v)!}$, and the $Q_v$ are given by

\[ Q_v = aM^{-1}\int_0^{Ma^{-1}} e^{-a\theta}\frac{(a\theta)^v}{v!}\,d\theta. \]





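We proceed to obtain a similar stationary distribution, also averaged over the time
interval $Ma^{-1}$, for the present case of the finite deficit, and we shall write it $\{\Pi_i^{(1)}\}$ $(i = 0, 1, \ldots, K)$.
To do this we consider the time interval $\theta$, where $\theta < Ma^{-1}$, such that $kMa^{-1}+T+\theta$ is
any time between two deliveries at times $kMa^{-1}+T$ and $(k+1)Ma^{-1}+T$. Let the
probabilities that there be consumption $i$ in this interval be

\[ p_i(\theta) = e^{-a\theta}\frac{(a\theta)^i}{i!}; \]

then if $\{P_i(kMa^{-1}+T+\theta)\}$ represent the distribution of the deficit $S(t)$ at the time
$kMa^{-1}+T+\theta$, this will be given in matrix form by

\[
\begin{pmatrix} P_0(kMa^{-1}+T+\theta)\\ P_1(kMa^{-1}+T+\theta)\\ \vdots\\ P_{K-1}(kMa^{-1}+T+\theta)\\ P_K(kMa^{-1}+T+\theta) \end{pmatrix}
=
\begin{pmatrix}
p_0(\theta) & 0 & \cdots & 0 & 0\\
p_1(\theta) & p_0(\theta) & \cdots & 0 & 0\\
\vdots & & & & \vdots\\
p_{K-1}(\theta) & p_{K-2}(\theta) & \cdots & p_0(\theta) & 0\\
q_K(\theta) & q_{K-1}(\theta) & \cdots & q_1(\theta) & q_0(\theta)
\end{pmatrix}
\begin{pmatrix} P_0^{(1)}\\ P_1^{(1)}\\ \vdots\\ P_{K-1}^{(1)}\\ P_K^{(1)} \end{pmatrix},
\]

where $q_j(\theta) = \sum_{i=j}^{\infty} p_i(\theta)$, and $P_0^{(1)}, P_1^{(1)}, \ldots, P_K^{(1)}$ are the stationary probability distribution of
the deficit at delivery times. We can write these in the forms

\[
\begin{aligned}
P_i(kMa^{-1}+T+\theta) &= \sum_{v=0}^{i} P_{i-v}^{(1)}\,p_v(\theta) \qquad (i = 0, 1, \ldots, K-1),\\
P_K(kMa^{-1}+T+\theta) &= \sum_{v=0}^{K} P_{K-v}^{(1)}\,q_v(\theta).
\end{aligned}
\]

The stationary distribution $\{\Pi_i^{(1)}\}$ will then be obtained from the $\{P_i(kMa^{-1}+T+\theta)\}$ by
averaging them over the interval of time $Ma^{-1}$, so that

\[
\begin{aligned}
\Pi_i^{(1)} &= aM^{-1}\int_0^{Ma^{-1}} \sum_{v=0}^{i} P_{i-v}^{(1)}\,p_v(\theta)\,d\theta
 = \sum_{v=0}^{i} P_{i-v}^{(1)}\,Q_v \qquad (i = 0, 1, \ldots, K-1);\\
\Pi_K^{(1)} &= aM^{-1}\int_0^{Ma^{-1}} \sum_{v=0}^{K} P_{K-v}^{(1)}\,q_v(\theta)\,d\theta
 = \sum_{v=0}^{K} P_{K-v}^{(1)}\,Q'_v,
\end{aligned}
\tag{36}
\]

where there is some similarity of form with (35).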

In the case of our numerical example (§ V (2) (b)), where $K = 9$, $T = \tfrac{1}{2}$, $M = a = 4$, the
values of the $p_{i-v}$ in (35) are given by equation (33) for values of $i - v = 0, 1, \ldots, 8$; the values
of the $Q_v$ in (35) and (36) are given for $v = 0, 1, \ldots, 8$ by

\[
\begin{array}{lll}
Q_0 = 0.245421, & Q_3 = 0.141633, & Q_6 = 0.027669,\\
Q_1 = 0.227106, & Q_4 = 0.092791, & Q_7 = 0.012784,\\
Q_2 = 0.190474, & Q_5 = 0.053718, & Q_8 = 0.005341;
\end{array}
\]

and the values of the $\{P_i^{(1)}\}$ are given by (34). Using these, we compute

\[
\begin{array}{ll}
\Pi_0 = 0.033214, & \Pi_0^{(1)} = 0.036490,\\
\Pi_1 = 0.097164, & \Pi_1^{(1)} = 0.102810,\\
\Pi_2 = 0.153677, & \Pi_2^{(1)} = 0.160212,\\
\Pi_3 = 0.176480, & \Pi_3^{(1)} = 0.181162,\\
\Pi_4 = 0.165573, & \Pi_4^{(1)} = 0.166619,\\
\Pi_5 = 0.134440, & \Pi_5^{(1)} = 0.132132,\\
\Pi_6 = 0.097291, & \Pi_6^{(1)} = 0.093169,\\
\Pi_7 = 0.063731, & \Pi_7^{(1)} = 0.059328,\\
\Pi_8 = 0.038132, & \Pi_8^{(1)} = 0.034385,\\
\sum_{i=9}^{\infty} \Pi_i = 0.040298, & \Pi_9^{(1)} = 0.033693.
\end{array}
\]





There would appear to be a close similarity in the values of the $\{\Pi_i\}$ and $\{\Pi_i^{(1)}\}$, so that we
could use the more easily computed values of $\Pi_i$ as approximations for the $\Pi_i^{(1)}$; however,
it is not possible without further work to state within what bounds this approximation
would be valid.
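The comparison can be reproduced with a few lines of Python. This is a hedged sketch rather than the author's computation: it assumes that $Q_v$ is the average of the Poisson probability $p_v(\theta)$ over one ordering interval, which reduces to $(1 - \sum_{j \le v} e^{-M}M^j/j!)/M$, and it takes the printed values (34) as input data. The function names are illustrative.

```python
from math import exp, factorial

def poisson(mean, i):
    return exp(-mean) * mean**i / factorial(i)

def time_averaged_weights(M, count):
    """Q_0..Q_{count-1}, assuming Q_v = (1 - sum_{j<=v} Poisson(M; j)) / M."""
    weights, cumulative = [], 0.0
    for v in range(count):
        cumulative += poisson(M, v)
        weights.append((1.0 - cumulative) / M)
    return weights

# Printed values (34) of the stationary delivery-time distribution, K = 9.
P_DELIVERY = [0.148685, 0.281325, 0.277080, 0.177621, 0.080930,
              0.026833, 0.006382, 0.001036, 0.000103, 0.000005]

def infinite_deficit(a, T, M, upto):
    """Formula (35): Pi_i = sum over v <= i of p_{i-v} Q_v, for i < upto."""
    Q = time_averaged_weights(M, upto)
    return [sum(poisson(a * T, i - v) * Q[v] for v in range(i + 1))
            for i in range(upto)]

def finite_deficit(P1, M):
    """Formula (36): convolution with Q_v below K, tail-weighted sum at K."""
    K = len(P1) - 1
    Q = time_averaged_weights(M, K)
    below = [sum(P1[i - v] * Q[v] for v in range(i + 1)) for i in range(K)]
    tails = [1.0 - sum(Q[:v]) for v in range(K + 1)]   # average of q_v(theta)
    at_K = sum(P1[K - v] * tails[v] for v in range(K + 1))
    return below + [at_K]

pitt = infinite_deficit(4.0, 0.5, 4.0, 9)
finite = finite_deficit(P_DELIVERY, 4.0)
```

With the parameters of the example, `pitt` and `finite` should match the two columns of the table above to within rounding of the printed inputs.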


(3) Provisioning with orders at fixed times $t = kMa^{-1}$, and replacements at
fixed times $t = (k+1)Ma^{-1}$ $(k = 1, 2, \ldots)$, with time-lag $Ma^{-1}$
(a) General equations, and an exact solution for the case of consumption with a discrete
geometric distribution
This is a special case of the previous problem (§ V (2) (a)), when the times for delivery of
the replacements and for ordering coincide; that is, a new order for replacements is put
in every time a delivery occurs, at time intervals $Ma^{-1}$. The consumption function is of the
same form (25), the replacement function is $r_2(t)$, (4), the overflow function is defined by (26),
where the time-lag $T$ is $Ma^{-1}$; in these conditions, with the same notation as before, the
equations (28) relate the probability distributions $\{P_i\}$ and $\{P_i^{(1)}\}$ for $S(t)$ at consecutive
ordering and replacement times $kMa^{-1}$ and $(k+1)Ma^{-1}$ $(k = 1, 2, \ldots)$. To obtain the
stationary probability distribution $P_0, P_1, \ldots, P_K$ for these times, we write $P_i^{(1)} = P_i$
$(i = 0, 1, \ldots, K)$, and solve the equations $P = pP$ together with $\sum_{i=0}^{K} P_i = 1$.
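If, for convenience, we group these equations in a slightly different way in the top half,
we have

\[
\begin{aligned}
P_0 &= p_0(P_0 + \ldots + P_K) + q_1P_K,\\
P_1 &= p_1(P_0 + \ldots + P_{K-1}) + q_2P_{K-1},\\
&\;\;\vdots\\
P_{K-1} &= p_{K-1}P_0 + q_{K-1}P_1,\\
P_K &= q_KP_0,\\
1 &= P_0 + P_1 + \ldots + P_K,
\end{aligned}
\tag{37}
\]

which can be solved in pairs, starting with $P_0$ and $P_K$, and working towards the centre of
the group in the order $P_1, P_{K-1}$; $P_2, P_{K-2}$; etc. We obtain the recurrence relations

\[
\begin{aligned}
P_0 &= p_0(1 - q_1q_K)^{-1},\\
P_K &= q_KP_0 = q_Kp_0(1 - q_1q_K)^{-1};\\
P_1 &= \{p_1(1 - P_K) + q_2p_{K-1}P_0\}(1 - q_2q_{K-1})^{-1},\\
P_{K-1} &= p_{K-1}P_0 + q_{K-1}P_1,
\end{aligned}
\]

for the first two pairs of solutions. Assuming that all probabilities $P_i$ from $P_0, P_1, \ldots$ to $P_{i-1}$,
and $P_K, P_{K-1}, \ldots$ to $P_{K-i+1}$, are known, the following formulae enable all other $P_i$ to be found
from these by a repetition of the process:

\[
\begin{aligned}
P_i &= \Big\{p_i\Big(1 - \sum_{n=K-i+1}^{K} P_n\Big) + q_{i+1}\,p_{K-i}\sum_{n=0}^{i-1} P_n\Big\}(1 - q_{i+1}q_{K-i})^{-1},\\
P_{K-i} &= p_{K-i}\sum_{n=0}^{i-1} P_n + q_{K-i}P_i \qquad (i = 1, \ldots, K-1).
\end{aligned}
\tag{38}
\]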


A distribution $\{p_i\}$ which gives a simple explicit solution of these equations is the
geometric; it is important to note, however, that this does not satisfy the independence
condition for non-overlapping intervals of time postulated in (25). For the geometric
distribution

\[ p_i = AB^i \qquad (i = 0, 1, \ldots), \tag{39} \]

where $A + B = 1$, we find from the recurrence relations (38) that the stationary distribution
of the deficit $\{P_i\}$ is given by

\[ P_i = AB^i(1 - B^{K+1})^{-1} \qquad (i = 0, 1, \ldots, K). \tag{40} \]
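The pairing scheme lends itself to a short program. The sketch below is an illustration under stated assumptions, not the author's algorithm verbatim: it takes the general step to be $P_i = \{p_i(1 - \sum_{n>K-i} P_n) + q_{i+1}p_{K-i}\sum_{n<i} P_n\}(1 - q_{i+1}q_{K-i})^{-1}$ followed by $P_{K-i} = p_{K-i}\sum_{n<i} P_n + q_{K-i}P_i$, and, when $K$ is even, it fills in the single middle term from the normalising condition. The function name is hypothetical.

```python
def stationary_by_pairs(p, K):
    """Solve P = pP of this section by working inwards in pairs (P_i, P_{K-i}).

    p[i] is the probability of consumption i in one interval (i = 0..K).
    """
    q = [1.0]
    for i in range(K + 1):
        q.append(q[-1] - p[i])            # q[i+1] = q[i] - p[i], tail sums
    P = [0.0] * (K + 1)
    low = 0.0                             # P_0 + ... + P_{i-1}
    high = 0.0                            # P_{K-i+1} + ... + P_K
    i = 0
    while i < K - i:
        P[i] = ((p[i] * (1.0 - high) + q[i + 1] * p[K - i] * low)
                / (1.0 - q[i + 1] * q[K - i]))
        P[K - i] = p[K - i] * low + q[K - i] * P[i]
        low += P[i]
        high += P[K - i]
        i += 1
    if i == K - i:                        # K even: middle term from sum = 1
        P[i] = 1.0 - low - high
    return P

# Geometric consumption p_i = A B^i: the result should be the truncated
# geometric distribution A B^i / (1 - B^(K+1)).
A, B, K = 0.3, 0.7, 6
P = stationary_by_pairs([A * B**i for i in range(K + 1)], K)
```

For the geometric case the list `P` can be compared term by term with $AB^i(1 - B^{K+1})^{-1}$, which gives an independent check on the closed form.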


(b) An exact solution for the case of consumption with a continuous exponential distribution 


Suppose that instead of dealing with integral units of the deficit, consumption and
replacement, we deal with units of magnitude $\Delta$; $\Delta$ will tend to zero, and $K$ and $i$ will tend to
infinity in such a way that the maximum deficit $K\Delta$ tends to the value $k$, and a deficit,
consumption or replacement $i\Delta$ tends to the value $s$ $(i = 0, 1, \ldots, K)$. The distribution of
the consumption then becomes continuous; for if we put

\[ A = 1 - e^{-\mu\Delta}, \qquad B = e^{-\mu\Delta}, \]

and let $\Delta \to 0$ in (39), so that $i\Delta \to s$, we have that the distribution of the consumption in the
interval of time $Ma^{-1}$ is

\[
\begin{aligned}
p(s)\,ds &= \lim_{\Delta\to0} \Delta^{-1}p_i\,ds\\
 &= \lim_{\Delta\to0} \Delta^{-1}e^{-i\mu\Delta}(1 - e^{-\mu\Delta})\,ds\\
 &= \mu e^{-\mu s}\,ds,
\end{aligned}
\tag{41}
\]


which is continuous, and of the exponential form. The solutions (40) of equations (37) will
lead by the same limiting process to an exact solution of the following provisioning problem.
A deficit $S(t)$ is defined continuously in the range $0 \le S(t) \le k$, at the fixed times $nMa^{-1}$
$(n = 0, 1, \ldots)$, and is subject to the following conditions:
(1) a consumption function $c(t)$ such that in a time interval $Ma^{-1}$ the distribution of the
consumption is continuous and of the exponential type

\[
\begin{aligned}
p(s)\,ds &= \Pr\{s < c((n+1)Ma^{-1}) - c(nMa^{-1}) < s + ds\}\\
 &= \mu e^{-\mu s}\,ds;
\end{aligned}
\]

(2) a replacement function such that

\[ r_2((n+1)Ma^{-1}) = c(nMa^{-1}), \]

where irregular quantities lying between the values $(s, s+ds)$ equal to the consumption in
the previous time interval $Ma^{-1}$ are ordered at times $nMa^{-1}$ and delivered and added to
the stock at times $(n+1)Ma^{-1}$;

(3) an overflow function such that in any time interval $Ma^{-1}$ between ordering and
delivery times $nMa^{-1}$, $(n+1)Ma^{-1}$, the consumption is lost at any time after the deficit
rises to the value $k$, and is otherwise met from the stock.

For small $\Delta$, the equations (37) hold, with the proviso that $\{p_i\}$ represent the probabilities
that the consumption in an interval $Ma^{-1}$ is $i\Delta$, and $\{P_i\}$ the stationary probabilities that
the deficit have a value $i\Delta$, so that the solutions (40) to them also hold. In the limit as $\Delta$
tends to zero in the manner defined above, the continuous stationary distribution of the
deficit $S(t)$ is given by

\[
\begin{aligned}
f(s)\,ds &= \lim_{\Delta\to0} \Delta^{-1}P_i\,ds\\
 &= \mu e^{-\mu s}(1 - e^{-\mu k})^{-1}\,ds \qquad (0 \le s \le k),
\end{aligned}
\tag{42}
\]

a truncated exponential distribution.


This solution can also be obtained by taking the limit as $\Delta$ tends to zero for equations (37);
this gives an integral equation for the continuous stationary probability distribution of
the deficit,

\[ f(s)\,ds = p(s)\,ds\int_0^{k-s} f(u)\,du + f(k-s)\,ds\int_s^{\infty} p(u)\,du. \tag{43} \]

It is easily verified that (42) will satisfy this equation.
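The verification can be written out in a few lines. The block below is a worked check, assuming that the integral equation reads $f(s) = p(s)\int_0^{k-s} f(u)\,du + f(k-s)\int_s^{\infty} p(u)\,du$ with $p(s) = \mu e^{-\mu s}$; the constant $c = (1 - e^{-\mu k})^{-1}$ is shorthand introduced here.

```latex
% Check that f(s) = c \mu e^{-\mu s}, with c = (1 - e^{-\mu k})^{-1},
% satisfies the integral equation when p(s) = \mu e^{-\mu s}.
\begin{align*}
\int_0^{k-s} f(u)\,du &= c\,\bigl(1 - e^{-\mu(k-s)}\bigr),
\qquad
\int_s^{\infty} p(u)\,du = e^{-\mu s},\\
p(s)\int_0^{k-s} f(u)\,du + f(k-s)\int_s^{\infty} p(u)\,du
 &= c\,\mu e^{-\mu s}\bigl(1 - e^{-\mu(k-s)}\bigr) + c\,\mu e^{-\mu(k-s)}\,e^{-\mu s}\\
 &= c\,\mu\bigl(e^{-\mu s} - e^{-\mu k} + e^{-\mu k}\bigr)
  = c\,\mu e^{-\mu s} = f(s).
\end{align*}
```

The two terms involving $e^{-\mu k}$ cancel exactly, which is why the truncated exponential reproduces itself.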


VI. AN EXACT SOLUTION OF THE STORAGE PROBLEM WITH A POISSON INPUT 


In attempting to find algebraic solutions for the stationary probability distribution $\{P_i\}$ of
the dam content $Z(t)$ in Moran's problem (§ II), where $P = pP$, $P$ being the column vector
of the $P_i$, it is found that no simple solution in an explicit form exists for an input function
of the Poisson type with parameter $a$, such that in the interval of time $t$, $t+1$, the
probability of an input $X(t)$ being $i$ units is

\[ p_i = e^{-a}a^i/i! \qquad (i = 0, 1, \ldots). \]

However, if in this problem the conditions for the input of Poisson type and for the overflow
are left unchanged, but the release rule is altered so that instead of the release rule (3) of
§ II ($M$ units at times $t+1$ if the total content of the dam $Z(t) + X(t)$ is greater than or equal to $M$,
or the total content of the dam $Z(t) + X(t)$ if this is less than $M$), a new release rule is
prescribed so that there is an output with a steady flow when the dam contains water, and no
output when it is empty, then $Z(t)$ is continuous, and it is possible to obtain solutions in
an explicit form for its stationary distribution.

Our method of approach will be to consider Moran's set of equations (5) for an infinitesimal
interval of time $\delta t$, instead of a unit interval of time; the input will remain of Poisson type,
and be a multiple of a definite discrete unit which will not tend to zero, while a small discrete
release will be made at the end of each infinitesimal interval of time under a rule of the same
type as rule (3) of § II. The equations can then be solved explicitly, and when $\delta t \to 0$, these
solutions will provide the continuous stationary probability distribution for the dam content
$Z(t)$, which is then itself continuous, when the release rule prescribed gives an output with
a steady flow when the dam contains water, or no output when the dam is empty.

Before proceeding further, it is preferable, since the theory of dams was identified with 
the general storage theory, to view the problem as one of the general storage function. It 
then becomes possible to interpret the problem as one of provisioning as well; here we would 
have a consumption of Poisson type, a replacement rule such that there is continuous 
delivery at a steady rate when the stock is not at its maximum value, and zero when it is, 
and an overflow function such that the consumption is lost when there is no stock to meet it. 
In practice, this could conceivably be the case of a stock of grain in a silo, where the replace- 
ment is a steady flow of grain which is stopped when the silo is full, and the consumption of 
Poisson type is lost when the silo is empty, and met from the stock otherwise. 

We frame the conditions of the problem for the storage function as follows: a discrete
storage function $S(t)$ is defined in the range $0 \le S(t) \le (K-1)\Delta$, where $K = bH + U$ ($b$, $H$, $U$
integers) at the fixed times $t = 0, \delta t, 2\delta t, \ldots$, and is subject to the conditions that

(1) it is fed by a discrete input function $I(t)$ of Poisson type with parameter $a$, such that
in a small interval of time $\delta t$ there may be inputs of $H\Delta$ or of no units respectively with
probabilities

\[
\begin{aligned}
p &= \Pr\{I(t+\delta t) - I(t) = H\Delta\} = 1 - e^{-a\,\delta t},\\
q &= \Pr\{I(t+\delta t) - I(t) = 0\} = e^{-a\,\delta t};
\end{aligned}
\]

(2) the output function is such that at fixed times $(n+1)\,\delta t$ $(n = 0, 1, \ldots)$, the discrete
output consists of a quantity $\Delta = C\,\delta t$ units, where $C$ is a constant, provided that the storage
function at a time just previous to $(n+1)\,\delta t$ is greater than or equal to $\Delta$,

\[ S(n\,\delta t) + I((n+1)\,\delta t) - I(n\,\delta t) \ge \Delta, \]

but is zero otherwise;
(3) the overflow function is such that in the interval $n\,\delta t$, $(n+1)\,\delta t$ there is no overflow if

\[ S(n\,\delta t) + I((n+1)\,\delta t) - I(n\,\delta t) \le K\Delta, \]

but there is an overflow

\[ S(n\,\delta t) + I((n+1)\,\delta t) - I(n\,\delta t) - K\Delta, \]

if this is greater than zero.
Writing $P_i$, $P_i^{(1)}$ for the probabilities that $S(t)$ take the values $i\Delta$ at times $n\,\delta t$, $(n+1)\,\delta t$
respectively, where $i = 0, 1, \ldots, K-1$, we have that the relations between these
probabilities are

\[
\begin{aligned}
P_0^{(1)} &= qP_0 + qP_1,\\
P_i^{(1)} &= qP_{i+1} \qquad (i = 1, \ldots, H-2),\\
P_i^{(1)} &= pP_{i-H+1} + qP_{i+1} \qquad (i = H-1, \ldots, K-2),\\
P_{K-1}^{(1)} = P_{bH+U-1}^{(1)} &= p(P_{(b-1)H+U} + \ldots + P_{bH+U-1}),
\end{aligned}
\tag{44}
\]

or in matrix form, $P^{(1)} = pP$, where $P$, $P^{(1)}$ are column vectors with elements $P_i$, $P_i^{(1)}$
respectively, and $p$ is the matrix of coefficients.
Formally, this is Moran's set of equations (5), in which we have set $p_0 = q$, $p_H = p$, and
all other $p_i = 0$ for $i$ not equal to $0$ or $H$. For the stationary distribution of $S(t)$, we put
$P_i^{(1)} = P_i$, and solve the equations $P = pP$, together with $\sum_{i=0}^{K-1} P_i = 1$. We see clearly that the
solution of these equations will fall into the $(b+1)$ distinct classes:

(0)th class: $P_1, P_2, \ldots, P_{H-1}$;

(1)th class: $P_H, P_{H+1}, \ldots, P_{2H-1}$;

($j$)th class: $P_{jH}, P_{jH+1}, \ldots, P_{(j+1)H-1}$;

($b$)th class: $P_{bH}, P_{bH+1}, \ldots, P_{bH+U-1}$.


The solution for the 0th class is immediately obvious from (44),

\[ P_R = pP_0q^{-R} \qquad (R = 1, 2, \ldots, H-1); \tag{45} \]

for the $j$th class, we obtain from our equations (44) the equation

\[ P_{jH+R} = q^{-(R+1)}\Big\{P_{jH-1} - p\sum_{N=0}^{R} q^N P_{(j-1)H+N}\Big\} \qquad (j = 1, \ldots, b;\ R = 0, \ldots, H-1), \tag{46} \]

relating any probability $P_{jH+R}$ in the $j$th class with the probabilities $P_{(j-1)H+N}$, $P_{jH-1}$ of
the $(j-1)$th class, so that starting with the solutions (45) for the 0th class, we can obtain
solutions progressively for one class after another.

Consider the formula

\[
P_{jH+R} = pP_0q^{-(jH+R)}\sum_{v=0}^{j} (-1)^v p^{v-1} q^{v(H-1)}
\left\{\binom{(j-v)H+R+v-1}{v-1} + p\binom{(j-v)H+R+v-1}{v}\right\},
\tag{47}
\]

where $R = 1, 2, \ldots, H-1$ for $j = 0$, and $R = 0, 1, \ldots, H-1$ for $j = 1, 2, \ldots, b$, and where it is
understood that $\binom{n}{r} = 0$ if $r < 0$, or if $n < r$. For $j = 0$, this gives

\[ P_R = pP_0q^{-R} \qquad (R = 1, 2, \ldots, H-1), \]

which is the solution (45) for the (0)th class. We prove that formula (47) holds for all values
of $j$ by induction; assume that it holds for the class $(j-1)$, then substituting for $P_{(j-1)H+N}$
and $P_{jH-1}$ given by (47) in the equation (46), we obtain


\[
\begin{aligned}
P_{jH+R} = q^{-(R+1)}\Bigg[\,&pP_0q^{-(jH-1)}\sum_{v=0}^{j-1}(-1)^v p^{v-1}q^{v(H-1)}
\left\{\binom{(j-v)H+v-2}{v-1} + p\binom{(j-v)H+v-2}{v}\right\}\\
&- p^2P_0q^{-(j-1)H}\sum_{v=0}^{j-1}(-1)^v p^{v-1}q^{v(H-1)}
\sum_{N=0}^{R}\left\{\binom{(j-1-v)H+N+v-1}{v-1} + p\binom{(j-1-v)H+N+v-1}{v}\right\}\Bigg]\\
= pP_0q^{-(jH+R)}\Bigg[\,&1 + \sum_{v=1}^{j}(-1)^v p^{v-1}q^{v(H-1)}
\Bigg\{\binom{(j-v)H+v-2}{v-1} + p\binom{(j-v)H+v-2}{v}\\
&+ \sum_{N=0}^{R}\left[\binom{(j-v)H+N+v-2}{v-2} + p\binom{(j-v)H+N+v-2}{v-1}\right]\Bigg\}\Bigg].
\end{aligned}
\tag{48}
\]

Now in (48), consider

\[
\binom{(j-v)H+v-2}{v-1} + \sum_{N=0}^{R}\binom{(j-v)H+N+v-2}{v-2}
= \binom{(j-v)H+R+v-1}{v-1}.
\]

In exactly the same way, we reduce the other terms in (48), so that

\[
\binom{(j-v)H+v-2}{v} + \sum_{N=0}^{R}\binom{(j-v)H+N+v-2}{v-1}
= \binom{(j-v)H+R+v-1}{v}.
\]


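We thus obtain for $P_{jH+R}$ in (48) the following expression:

\[
P_{jH+R} = pP_0q^{-(jH+R)}\left[1 + \sum_{v=1}^{j}(-1)^v p^{v-1}q^{v(H-1)}
\left\{\binom{(j-v)H+R+v-1}{v-1} + p\binom{(j-v)H+R+v-1}{v}\right\}\right],
\tag{49}
\]

which is equivalent to (47). Now, since this has been shown to hold for the 0th class, then
it will hold for the 1th class, and so on up to class ($b$); this proves that (47) (or the slightly
different form (49)) is the solution required for the stationary distribution of $S(t)$ in the
discrete case considered. We write out in full the formulae for a few classes so as to give an
idea of their structure:

\[
\begin{aligned}
P_R &= pP_0q^{-R} \qquad (R = 1, 2, \ldots, H-1);\\
P_{H+R} &= pP_0q^{-(H+R)}\left[1 - q^{H-1}\left\{1 + p\binom{R}{1}\right\}\right],\\
P_{2H+R} &= pP_0q^{-(2H+R)}\left[1 - q^{H-1}\left\{1 + p\binom{H+R}{1}\right\}
 + pq^{2(H-1)}\left\{\binom{R+1}{1} + p\binom{R+1}{2}\right\}\right],\\
P_{3H+R} &= pP_0q^{-(3H+R)}\Bigg[1 - q^{H-1}\left\{1 + p\binom{2H+R}{1}\right\}
 + pq^{2(H-1)}\left\{\binom{H+R+1}{1} + p\binom{H+R+1}{2}\right\}\\
 &\qquad\qquad - p^2q^{3(H-1)}\left\{\binom{R+2}{2} + p\binom{R+2}{3}\right\}\Bigg]
 \qquad (R = 0, 1, \ldots, H-1).
\end{aligned}
\]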


For the complete solution in any particular case, the value of $P_0$ is also required; this can
be found from the equation $\sum_{i=0}^{K-1} P_i = 1$.
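A quick numerical cross-check of the closed form against the step-by-step recurrence is sketched below in Python. It is an illustration only and rests on assumptions stated here: that the stationary equations reduce to $P_1 = pP_0/q$, $P_{m+1} = P_m/q$ for $1 \le m \le H-2$, and $P_{m+1} = (P_m - pP_{m-H+1})/q$ for $m \ge H-1$, with $H \ge 2$, and that the closed form is the sum over $v = 1, \ldots, j$ of signed binomial terms given above. Function names are hypothetical.

```python
from math import comb

def by_recurrence(p, H, last):
    """Unnormalised P_0..P_last (with P_0 = 1), stepping upwards; assumes H >= 2."""
    q = 1.0 - p
    P = [1.0, p / q]                                  # P_1 = p P_0 / q
    for m in range(1, last):
        fed = p * P[m - H + 1] if m >= H - 1 else 0.0
        P.append((P[m] - fed) / q)                    # P_{m+1} = (P_m - p P_{m-H+1}) / q
    return P[:last + 1]

def closed_form(p, H, j, R):
    """Ratio P_{jH+R} / P_0 from the closed form with the leading 1 separated."""
    q = 1.0 - p
    total = 1.0
    for v in range(1, j + 1):
        n = (j - v) * H + R + v - 1
        total += ((-1) ** v * p ** (v - 1) * q ** (v * (H - 1))
                  * (comb(n, v - 1) + p * comb(n, v)))
    return p * q ** (-(j * H + R)) * total

# Example: H = 3, b = 2, U = 3, so K = 9 and the states are 0..8.
p, H, last = 0.2, 3, 8
stepped = by_recurrence(p, H, last)
direct = [1.0] + [closed_form(p, H, *divmod(m, H)) for m in range(1, last + 1)]
P0 = 1.0 / sum(stepped)                               # normalising condition
```

The two lists `stepped` and `direct` should agree to rounding error, and dividing either by its sum gives the normalised distribution with `P0` as the probability of an empty store.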


We now proceed to the continuous case by letting $\delta t \to 0$, and consequently $\Delta = C\,\delta t \to 0$,
where the constant $C$ can now be interpreted as the rate of output per unit time. We allow
$\Delta$ to tend to zero, and the previously defined integers $K$, $H$ and $U$ to tend to infinity in
such a way that

(1) the maximum value of the storage function $K\Delta \to k$, or $(bH + U)\Delta \to bh + u$,

(2) the input $H\Delta \to h$,

(3) the values of the storage function $(jH + R)\Delta \to jh + r$, which we shall also write $s$
$(j = 0, 1, \ldots, b;\ 0 \le r < h)$.

As a result of this limiting process, the conditions of the problem which we have just
considered for the storage function are changed; the function $S(t)$ has now a continuous
range of values, $0 \le S(t) \le k$, where $k = bh + u$, and is defined for all time $t$ ($t$ continuous)
in such a way that

(1) it is fed by a discrete input of Poisson type with parameter $a$, so that in a small
interval of time $\delta t$ there may be inputs of $h$ or of no units with probabilities

\[
\begin{aligned}
p &= \Pr\{I(t+\delta t) - I(t) = h\} = 1 - e^{-a\,\delta t} = 1 - e^{-(a/C)\Delta} = 1 - e^{-\mu\Delta},\\
q &= \Pr\{I(t+\delta t) - I(t) = 0\} = e^{-a\,\delta t} = e^{-(a/C)\Delta} = e^{-\mu\Delta},
\end{aligned}
\]

where $\mu = a/C$;





(2) the output is such that it has a constant rate $C$ per unit time, provided the storage
function is not zero, but is zero if the storage function is zero;
(3) the overflow is such that in any interval of time $\delta t$ there is no overflow if

\[ S(t) + I(t+\delta t) - I(t) \le k, \]

but there is an overflow

\[ S(t) + I(t+\delta t) - I(t) - k \]

if this is greater than zero.
We write the continuous stationary probability density for $S(t)$ as $f(s)$, so that

\[ f(s)\,ds = \Pr\{s < S(t) < s + ds\}. \]

Now for $s = \lim_{\Delta\to0}(jH + R)\Delta = jh + r$, we have that

\[ f(s)\,ds = \lim_{\Delta\to0} P_{jH+R}\,\Delta^{-1}\,ds; \]

this enables us to obtain from (45) and (49) the equations for $f(s)$ in various ranges of
magnitude $h$. For $j = 0$ $(0 < r < h)$ we have

\[
\begin{aligned}
f(r) &= \lim_{\Delta\to0} \Delta^{-1}pP_0q^{-R} = \lim_{\Delta\to0} P_0\Delta^{-1}(1 - e^{-\mu\Delta})\,e^{\mu R\Delta}\\
 &= \mu e^{\mu r}P_0.
\end{aligned}
\tag{50}
\]


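For $j = 1, 2, \ldots, b$ and $0 \le r < h$, we have

\[
\begin{aligned}
f(jh+r) &= \lim_{\Delta\to0} \Delta^{-1}pP_0q^{-(jH+R)}\left[1 + \sum_{v=1}^{j}(-1)^v p^{v-1}q^{v(H-1)}
 \left\{\binom{(j-v)H+R+v-1}{v-1} + p\binom{(j-v)H+R+v-1}{v}\right\}\right]\\
 &= \lim_{\Delta\to0} \Delta^{-1}(1 - e^{-\mu\Delta})\,e^{\mu(jH+R)\Delta}P_0\Bigg[1 + \sum_{v=1}^{j}(-1)^v e^{-v\mu(H-1)\Delta}\\
 &\qquad\times\left\{(\mu\Delta)^{v-1}\binom{(j-v)H+R+v-1}{v-1} + (\mu\Delta)^{v}\binom{(j-v)H+R+v-1}{v}\right\}\Bigg]\\
 &= \mu e^{\mu(jh+r)}P_0\left[1 + \sum_{v=1}^{j}(-1)^v e^{-v\mu h}
 \left\{\frac{\mu^{v-1}((j-v)h+r)^{v-1}}{(v-1)!} + \frac{\mu^{v}((j-v)h+r)^{v}}{v!}\right\}\right].
\end{aligned}
\tag{51}
\]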


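In addition to these probability densities, which are continuous in the intervals in which
they are defined, there is also a concentration of probability $P_0$ at $s = 0$. To complete the
solution, we require the value of $P_0$; this will be given in any particular case where the value
of $k = bh + u$ is known by

\[ P_0 + \sum_{j=0}^{b-1}\int_0^h f(jh+r)\,dr + \int_0^u f(bh+r)\,dr = 1. \tag{52} \]

We write out in full some of the formulae for the $f(jh+r)$, in order to indicate their structure:

\[
\begin{aligned}
f(r) &= \mu e^{\mu r}P_0 \qquad (0 < r < h),\\
f(h+r) &= \mu e^{\mu(h+r)}P_0\{1 - e^{-\mu h}(1 + \mu r)\},\\
f(2h+r) &= \mu e^{\mu(2h+r)}P_0\left\{1 - e^{-\mu h}(1 + \mu(h+r)) + e^{-2\mu h}\left(\mu r + \frac{\mu^2r^2}{2!}\right)\right\},\\
f(3h+r) &= \mu e^{\mu(3h+r)}P_0\Bigg\{1 - e^{-\mu h}(1 + \mu(2h+r)) + e^{-2\mu h}\left(\mu(h+r) + \frac{\mu^2(h+r)^2}{2!}\right)\\
 &\qquad\qquad - e^{-3\mu h}\left(\frac{\mu^2r^2}{2!} + \frac{\mu^3r^3}{3!}\right)\Bigg\} \qquad (0 \le r < h).
\end{aligned}
\tag{53}
\]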
It is interesting to note that these formulae for $f(jh+r)$ $(j = 1, 2, \ldots, b)$ can also be obtained
from the limit of equation (46), when $\Delta \to 0$,

\[
\begin{aligned}
f(jh+r) &= \lim_{\Delta\to0} P_{jH+R}\,\Delta^{-1}\\
 &= \lim_{\Delta\to0} q^{-(R+1)}\Big\{P_{jH-1}\Delta^{-1} - p\Delta^{-1}\sum_{N=0}^{R} q^N(P_{(j-1)H+N}\Delta^{-1})\Delta\Big\}.
\end{aligned}
\tag{54}
\]

Now for $j = 1$, we have

\[
\begin{aligned}
f(h+r) &= \lim_{\Delta\to0} q^{-(R+1)}\Big\{P_{H-1}\Delta^{-1} - p\Delta^{-1}\sum_{N=1}^{R} q^N(P_N\Delta^{-1})\Delta - p\Delta^{-1}P_0\Big\}\\
 &= e^{\mu r}\Big\{f(h-0) - \mu\Big[\int_0^r e^{-\mu\eta}f(\eta)\,d\eta + P_0\Big]\Big\},
\end{aligned}
\tag{55}
\]

since $P_N\Delta^{-1} \to f(\eta)$, where $N\Delta \to \eta$ for all $N \ne 0$, but there is a concentration of probability
$P_0$ at $s = 0$. For $j = 2, 3, \ldots, b$ the equation (54) gives

\[ f(jh+r) = e^{\mu r}\Big\{f(jh-0) - \mu\int_0^r e^{-\mu\eta}f((j-1)h+\eta)\,d\eta\Big\}. \tag{56} \]

The formulae (53) can be obtained by putting the formula (50) for $f(r)$ in (55) and
integrating it to find $f(h+r)$, then repeating this process for $f(jh+r)$ by using (56). More
generally, it can be proved by induction that our solutions (50) and (51) satisfy the integral
equations (55) and (56).

It is found that (52) does not lend itself to the computation of $P_0$ as well as an alternative
equation for the integral of the probabilities, which we now obtain. To do so, we write the
probability density $f(s)$ for the various intervals in the following manner:

\[
\begin{aligned}
f(s) &= \mu e^{\mu s}P_0 \qquad (0 < s < h);\\
f(s) &= \mu e^{\mu s}P_0\left\{1 + \sum_{v=1}^{j}(-1)^v e^{-v\mu h}\left[\frac{\mu^{v-1}(s-vh)^{v-1}}{(v-1)!} + \frac{\mu^{v}(s-vh)^{v}}{v!}\right]\right\}\\
 &\qquad\qquad (jh \le s < (j+1)h;\ j = 1, 2, \ldots, b).
\end{aligned}
\]

The equation (52) can then be written as

\[
P_0\left(1 + \int_0^{bh+u}\mu e^{\mu s}\,ds + \sum_{j=1}^{b}(-1)^j\int_{s=jh}^{bh+u}\mu e^{\mu(s-jh)}
\left\{\frac{\mu^{j-1}(s-jh)^{j-1}}{(j-1)!} + \frac{\mu^{j}(s-jh)^{j}}{j!}\right\}ds\right) = 1,
\]

which shortens considerably the work involved in the computation of $P_0$.
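To show how the value of $P_0$ might be computed from this alternative equation, the following Python sketch evaluates the integrals numerically by Simpson's rule. It is an illustration under assumptions, not part of the original derivation: the equation is taken in the form $P_0\{1 + \int_0^k \mu e^{\mu s}ds + \sum_{j=1}^{b}(-1)^j\int_{jh}^{k}\mu e^{\mu(s-jh)}[\ldots]ds\} = 1$ with $k = bh + u$, and the comparison value `p0_closed` comes from integrating each term exactly (each integrand is the derivative of $e^{\mu x}(\mu x)^j/j!$), a step worked out here rather than stated in the text.

```python
from math import exp, factorial

def bracket(mu, x, j):
    """Integrand of the j-th term, with x = s - jh."""
    return mu * exp(mu * x) * ((mu * x) ** (j - 1) / factorial(j - 1)
                               + (mu * x) ** j / factorial(j))

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3.0

def p0_numerical(mu, h, k):
    """P_0 from the alternative equation, integrals done numerically."""
    b = int(k // h)
    inverse = 1.0 + simpson(lambda s: mu * exp(mu * s), 0.0, k)
    for j in range(1, b + 1):
        inverse += (-1) ** j * simpson(lambda x: bracket(mu, x, j), 0.0, k - j * h)
    return 1.0 / inverse

def p0_closed(mu, h, k):
    """Same quantity with each integral evaluated exactly (derived here)."""
    b = int(k // h)
    return 1.0 / sum((-1) ** v * exp(mu * (k - v * h)) * (mu * (k - v * h)) ** v
                     / factorial(v) for v in range(b + 1))
```

When $k \le h$ (so that $b = 0$) both functions reduce to $P_0 = e^{-\mu k}$, which is the probability that no input has occurred during the time $k/C$ needed to empty a full store.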


I am greatly indebted to Prof. P. A. P. Moran for suggesting several of the problems, and 
for his extremely helpful discussion and criticism of the work throughout all stages. 


REFERENCES 


Feller, W. (1950). An Introduction to Probability Theory and its Applications. New York: John
Wiley and Sons.

Frazer, R. A., Duncan, W. J. & Collar, A. R. (1947). Elementary Matrices. Cambridge University
Press.

Moran, P. A. P. (1954). A probability theory of dams and storage systems. Aust. J. Appl. Sci.
5, no. 2, pp. 116-24.

Moran, P. A. P. (1955). A probability theory of dams and storage systems. II (to appear).

Pitt, H. R. (1946). A theorem on random functions with applications to a theory of provisioning.
J. Lond. Math. Soc. 21, 16-22.







APPROXIMATE CONFIDENCE INTERVALS 
III. A BIAS CORRECTION 


By M. S. BARTLETT
University of Manchester 


In my second paper on approximate confidence intervals (Bartlett, 1953, to be referred to
as II), where a general method of eliminating 'nuisance parameters' was discussed, it was
claimed (see top of p. 310) that the effect of substituting maximum-likelihood estimates
for the nuisance parameters could be neglected to the first order ($O(1/\sqrt{n})$) of approximation.
I regret to say that this assertion is not in general correct, and the statistic $T_e$ obtained after
such substitution may have a bias of $O(1/\sqrt{n})$ which requires correction. It is even necessary
in the expansion

\[ T_e = T + (\hat\theta_2 - \theta_2)\frac{\partial T}{\partial\theta_2} + \tfrac{1}{2}(\hat\theta_2 - \theta_2)^2\frac{\partial^2 T}{\partial\theta_2^2} \tag{1} \]

to make allowance not only for the second term, but also for the third term, on the right-hand
side of (1), or to use the equivalent approximation to this order given in equation (14) of II.
For convenience we may write (cf. equation (11) of II)

\[ T = a\frac{\partial L}{\partial\theta_1} + b\frac{\partial L}{\partial\theta_2}, \tag{2} \]

where $a = 1/\sqrt{I_{11.2}}$, $b = -I_{12}/(I_{22}\sqrt{I_{11.2}})$. Then, noting that $\partial T/\partial\theta_2$ is $O(1)$, and $\partial^2T/\partial\theta_2^2$ $O(\sqrt{n})$,
we find, omitting any terms of smaller order of magnitude than $1/\sqrt{n}$,
a2 a2 : 7 a 
T_T Al eL eL ob éa a oe 





~ J,,00,|" 00,00, ° 002 * 00,00, 00,00, 
_ 1 (ab\? { PL | eL) Ca ab 
+ 378, (=,) lax \50, 008) +O” \ag8)~ 74299, —24e5g,|- (3) 
Hence, making use of relations such as 
jol OL | _ | ®L | che 
\30, 00,004) ~~” \80,008| ~ 30,’ (4) 
we obtain from (3) 
us aL > Ole 
MTD = > ar valle caw 9 +h | 
(earl) | *) 
~ Thy [2 \208| +? 20, +0(;). (5) 
When $I_{12} = 0$, this reduces to
Bin) ~ — 40 ON | te Vn) (6) 
e \00, 063| 22 V*11/° 
For reference, the generalizations of (3) and (5) in the case of several nuisance parameters
$\theta_i$ $(i = 2, \ldots, m)$ are
, ob eL oL  oOLdA oLos; 
A 0,4 30,00, * ®20, a0, * 55.88," 60:60 
oL oL | 


a4 .. 
peste a j 
+i" 6,00, 10 AB| 5 5.38, * 29a 75, 30,)— he ag, — a6, ‘|, -- 
where

\[ T \equiv A\frac{\partial L}{\partial\theta_1} + B_i\frac{\partial L}{\partial\theta_i}, \]

say, and
. E{T}~ —4A4 | Ds 2ym | BB [z i hh 4. oie Lin (5a) 
. L100, 00,00), 00; j|" \20; 00,00), 00; |’ 





the summation convention being understood for $g, h, i, j, k$ $(2, \ldots, m)$, and $I^{ij}$ being the inverse
of $I_{ij}$.

Provided $T_e$ is corrected for its bias, it is found further from the formulae (3) and (3a)
that its second and third moments are then the same as those of $T$ to $O(1/\sqrt{n})$.

Examples. (i) and (ii). The two examples discussed in §§ 6 and 7 of II are not affected by
this correction. In the analysis of variance problem of § 6, $I_{12} \ne 0$, but it will be found that
$E(T_e) \sim 0$ from (5) above. (I have learnt that more direct investigations of the asymptotic
confidence interval for this problem, using the methods first developed by Dr B. L. Welch
in the Behrens-Fisher problem, have recently* been made by J. R. Green and by A. Huitson;
and, in particular, Dr Huitson has checked the consistency of my solution with his own
results.)

Similarly, no bias correction arises in the time-series problem of § 7, though there is
unfortunately a further misstatement in equation (23), which should be corrected. The
maximum-likelihood estimate of $\sigma^2 \equiv \alpha$, say, should have been given as

\[ \hat\alpha = \sum_{r=1}^{n}(X_r - \beta X_{r-1})^2/n. \tag{7} \]
r=1 


The confidence interval from (24) then becomes 
pet und pve! ote 
p= Bs v-f)+" +0(5) (8) 


(cf. my contribution to the discussion at the Royal Statistical Society Symposium on 
Interval Estimation, 12 May 1954). 

(iii) To illustrate the use of the bias correction where it is not zero, consider the last
example above when the mean value $m$ of $X$ is not zero and requires estimation. The log
likelihood function (apart from end-corrections of relative order $1/n$) is now

\[ L = -\frac{n}{2}\log\alpha - \frac{1}{2\alpha}\sum_{r=1}^{n} Y_r^2, \tag{9} \]

where

\[ Y_r = (X_r - m) - \beta(X_{r-1} - m). \tag{10} \]

The $L$ derivatives with respect to $\alpha$ and $\beta$ are as before, with $\xi_r = X_r - m$ in place of $X_r$. Also

\[ \frac{\partial L}{\partial m} = \frac{(1-\beta)}{\alpha}\sum_{r=1}^{n}(\xi_r - \beta\xi_{r-1}), \]
r=1 
Fn “—_ — py nlx, Tea = 0, Lom = 0, 
@L { ®L | 
aliens oie D " “) : 
apome = 21 Alla = Bia pesy 
Hence the quantity n : sect 
X)— BXi_,) X,_,/@’ 
ws = (X,—BXp a) Xr a) 


SCC.) ae 


* See references. 


(11) 


where $X'_r = X_r - \bar{X}$, and $\hat\alpha'$ is like $\hat\alpha$ in (7) but with $X'_r$ in place of $X_r$, has bias





1 
K{T,} ~ — —.~—;- 
{ J (1—£) VI pp’ 
and the equation for # becomes 

ae A) BUA), 12 
a —f) yn v{(. — B?) n} ar si 

a ai A 
or p= Bs y—By+ +0(-5) (13) 


in place of (8), where now 

A n n 

p= X,X,/ Ztail 

r=1 r=l1 
(iv) Finally, in view of the above amendments, it seemed advisable to check these results
to $O(1/\sqrt{n})$ by means of a fairly complicated example for which the result is already known.
A useful case is that of the classical correlation coefficient $\rho$, with the two standard
deviations $\alpha$ and $\beta$ as nuisance parameters. We take, apart from an additive constant,





n 8] 2pT8,8, 83 
L = x nlog {apa ( 1—p? | l—p? 5 [ar op +4, (14) 
where $r$, $s_1$, $s_2$ are the usual sample estimates of $\rho$, $\alpha$ and $\beta$ respectively (the means for
convenience are already eliminated). Then (cf. M. G. Kendall's Advanced Theory of Statistics,
vol. 2, ex. 17.18)








n(2 — 9?) _ n(2—p?) _ M(1+p") | 
x2 ~ (1 —p p2)’ BB f?(1 —p?)’ tail ( =p | 
(15) 
Si sina’ np* snip pres. | 
W aB(i—p%)’ **  all—p)’ Bil —p?) 


It is found further from (14) that 








at nd WR. 

\da3) ~~ -x3(1 —p?) * \ea2eBl ~~ a B(1 —p?)’ 

| AL | _ —2mp(i—p*) pj _@L | _ mp(i+p*) | (16) 
sa Ca2ep| (1 — 2)?” \Cachepi = xB(1 —p?)?’ 

(2 a 2n(1 +p?) jeeL\  —4np(3 +p?) 

\exdp?) —-x(1 —p?)?” “\dp3| (pty ’” 





the expressions for the remaining quantities with $\alpha$ and $\beta$ interchanged being obvious from
symmetry.


The statistic $T$ uncorrelated with $\partial L/\partial\alpha$ and $\partial L/\partial\beta$ is


oL ap ¢L Bp s/o 
ral; p 2(1—p%)da * xi 5 oe) \op.ap" (ua) 


where J), = ”/(1—p*)*. From the above formulae, and the general expression for 


ek oh eb) {iM a} —_ d 
\30,00, 20, in terms of 4 \5, 20,00, and derivatives of J;; given in IT, it is found (after 
some algebra) that the skewness of 7' in (17) is zero to O(1/,/n). 


We now replace $T$ by $T_e$, the maximum-likelihood estimates of $\alpha$ and $\beta$ ($\rho$ known) being


a=a, /(7=4). p=s, |(=%). (18) 
















T in (17) then reduces to 


__vm jeLl\ _ yn(r—p). 
m=; a(5),7 Toe (19) 


the functional form for r in (19) should be noted. From the general bias formula (5a), we 
find (noting that [** = }(2—?) a?/n, I*? = }p*af/n) 
E(T.) = $p| Vn. (20) 


Hence our confidence interval is given by 


ED Rega (21) 
l1—pr 2/n Jn 


which agrees with the known results 

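\[ E(z) \sim \zeta + \tfrac{1}{2}\rho/n, \qquad \sigma^2(z) \sim 1/n, \qquad \gamma_1(z) \sim 0, \]

where $z = \tfrac{1}{2}\log_e\{(1+r)/(1-r)\}$, $\zeta = \tfrac{1}{2}\log_e\{(1+\rho)/(1-\rho)\}$.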
While the above method is not of course required for this example, it is of some interest that
the result (21) has been obtained merely by straightforward differentiation of the log likelihood
function (14). It is, however, apparent that the algebra involved in getting the next
term in the expansion becomes in general so intractable that a more direct attack on
individual problems is then usually the more promising.
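
As an illustration only, an interval of the form reached above, (r − ρ)/(1 − ρr) − ρ/(2n) = ± k/√n, can be solved numerically for ρ. The sketch below assumes that form; the function name `bartlett_interval` and the symbol `k` (taken here to be a standard normal deviate, 1.96 by default) are hypothetical and not from the paper.

```python
import math


def bartlett_interval(r, n, k=1.96, tol=1e-12):
    """Solve (r - rho)/(1 - rho*r) - rho/(2n) = +/- k/sqrt(n) for rho.

    Assumes |r| < 1 and k < sqrt(n). The left-hand side is strictly
    decreasing in rho on (-1, 1), so plain bisection is sufficient.
    """
    def h(rho):
        return (r - rho) / (1.0 - rho * r) - rho / (2.0 * n)

    def solve(target):
        lo, hi = -1.0 + 1e-12, 1.0 - 1e-12
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if h(mid) > target:   # root lies to the right of mid
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    c = k / math.sqrt(n)
    return solve(c), solve(-c)    # (lower limit, upper limit)


lower, upper = bartlett_interval(0.5, 50)
```

The limits are not symmetric about r, reflecting both the functional form (r − ρ)/(1 − ρr) and the bias term ρ/(2n).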
THE THEORY OF CORRELATION BETWEEN TWO CONTINUOUS
VARIABLES WHEN ONE IS DICHOTOMIZED


By ROBERT F. TATE
University of Washington†


1. INTRODUCTION AND SUMMARY 


The problem of biserial correlation arises when one is sampling from a bivariate normal
population in which one of the variables has been dichotomized, giving rise to only two
observable values, say 0 and 1, and one wishes to use this dichotomized sample to estimate,
or to test hypotheses concerning, the correlation coefficient ρ of the original bivariate
normal distribution. The problem of biserial correlation occurs frequently in psychological
work, especially in test construction and validation.

The term biserial correlation was introduced by Karl Pearson (1909), who was the first 
to perceive the statistical importance of this particular type of problem. He proposed as 
an estimator the sample biserial correlation coefficient. The asymptotic variance of this 
estimator was derived by Soper (1913). Much literature exists on the subject of how best 
to compute Pearson’s coefficient. In this connexion the reader should see Du Bois (1942), 
Dunlap (1936) and Royer (1941). 

Prof. Harold Hotelling realized some years ago that the existing methods for dealing 
with the problem of biserial correlation were far from satisfactory, and suggested to the 
author that the whole situation be reconsidered. The results of this examination are con- 
tained in the present paper. 

§ 2 contains a list of most of the notation which has been adopted, and § 3 deals with the
mathematical model. In § 4 the question of maximum likelihood is treated. Asymptotic
variances are derived for the estimators ω̂ and ρ̂. The asymptotic variance for ρ̂ is compared
with the approximate expression arrived at by Maritz (1953) when he considered a somewhat
restricted model. Both expressions are shown to achieve their minimum value at
ω = 0 when ρ is fixed.

Matters concerning asymptotic normality and asymptotic efficiency are also considered. 

An appraisal of r*, the sample biserial correlation coefficient, is given in detail in § 5.
It is shown to have asymptotic efficiency for estimating ρ which is 1 when ρ = 0, but which
approaches 0 when |ρ| approaches 1. The well-known fact that r* may be greater than 1 is
pointed out and some notion of the magnitude of r* is obtained by a consideration of the
product-moment correlation coefficient r. Asymptotic normality of r* is verified by the use
of a theorem of Cramér. The asymptotic standard deviation is tabulated at the end of the
paper (Table 2). A proof is given for the customarily assumed fact that the asymptotic
variance has a minimum for fixed ρ when ω = 0. For the case ω = 0 an approximate variance
stabilizing transformation is derived. Calculations pertaining to this transformation may

† Part of this research was done under an Office of Naval Research contract at the Institute of
Statistics, University of North Carolina. The balance was sponsored by the Office of Naval Research on
the Navy Theoretical Statistics Project at the Laboratory of Statistical Research, University of
Washington.
be carried out by using a table (see Fisher, 1946, Table V B) for the function tanh⁻¹ r. This
result should prove useful in many situations.

§ 6 is devoted to a discussion of an iterative method of solution for the likelihood equations.
The method is essentially Newton's method for two variables, the calculated values ω*,
r* being used to start the iteration. The computations are not really prohibitive, considering
the importance of the problem, and are to a certain extent organizable for punched-card
methods. An example is given with all of the calculations illustrated. Values of φ(x), the
reciprocal of Mills's ratio, are required for the solution of the likelihood equations. These
may be obtained from the tables published as a separate contribution, immediately following
the present paper.

Two matters of some importance which are not considered in the present paper are:

(1) An investigation of the bias of ρ̂.
(2) The numerical tabulation of the asymptotic variances of ω̂ and ρ̂.

Further study is indicated on at least the second point.


2. NOTATION


To eliminate the distraction of searching through the text, we shall list here most of the
symbols and notational devices used:

$\psi(x,y) = \dfrac{1}{2\pi\sqrt{(1-\rho^2)}}\exp\left[-\dfrac{1}{2(1-\rho^2)}(x^2 - 2\rho xy + y^2)\right]$, the bivariate normal density.

$\lambda(x) = \dfrac{1}{\sqrt{(2\pi)}}\,e^{-\frac12 x^2}$, the normal density.†

$p(x) = \int_x^{+\infty}\lambda(t)\,dt, \qquad q(x) = 1 - p(x)$.†

$\phi(x) = \dfrac{\lambda(x)}{p(x)}$, the reciprocal of Mills's ratio.

$\xi(x,\omega) = \int_\omega^{+\infty}\psi(x,y)\,dy$.

$\eta(x,\omega) = \int_{-\infty}^{\omega}\psi(x,y)\,dy$.

X          the undichotomized normal random variable.
Y          the dichotomized normal random variable.
ω          the point of dichotomy of Y, measured in standard units.
Z          the discrete random variable induced by the dichotomization of Y.
f(x, z)    the joint density of the random variables X and Z.
r*         the sample biserial correlation coefficient.
AV(r*)     the asymptotic variance of r*.
AEff(r*)   the asymptotic efficiency of r* for estimating ρ.
N(μ, σ²)   a normal random variable with mean μ and variance σ².


† [Editorial Note. To bring this notation into conformity with that of the tables printed on pp. 217-221
below and with that used in the recently published Biometrika Tables for Statisticians, vol. 1, it is
necessary to write Z = λ(x), Q = p(x), P = 1 − p(x), so that φ(x) = Z/Q or Z/P according as x is
> 0 or < 0.]
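
The one-variable functions in the list above are easy to evaluate directly. The following is a minimal sketch using only the Python standard library; the function names `lam`, `p`, `q` and `phi` are chosen here to mirror λ, p, q and φ and are otherwise hypothetical.

```python
import math


def lam(x):
    """Normal density lambda(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def p(x):
    """Upper-tail area p(x), the integral of lambda from x to +infinity."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q(x):
    """Lower-tail area q(x) = 1 - p(x), computed without cancellation."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def phi(x):
    """Reciprocal of Mills's ratio, lambda(x) / p(x)."""
    return lam(x) / p(x)


# phi(x) for x > 0 corresponds to the tabled ratio Z/Q, and for x < 0 to Z/P.
print(round(phi(0.5), 5), round(phi(-0.5), 5))
```

For large positive x the tail area underflows, so an asymptotic expansion would be needed there; that case is not handled in this sketch.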
3. MATHEMATICAL MODEL


Let (X, Y) have the bivariate normal distribution ψ{(x−μ)/σ, (y−ν)/τ}. Let Z be a dichotomy
of Y, with the point of dichotomy ω measured in standard units. Without losing any
generality we may set ν = 0 and τ = 1. Z is thus a random variable which takes the value 1
when Y ≥ ω and the value 0 when Y < ω. Obviously,

$$P(Z=1) = \int_\omega^{+\infty}\lambda(y)\,dy = p(\omega), \qquad P(Z=0) = q(\omega). \qquad (3-1)$$


Consider a sample of n independent random vectors (X_1, Z_1), (X_2, Z_2), ..., (X_n, Z_n). The
problem of biserial correlation consists of finding a suitable function of (X_i, Z_i) (i = 1, 2, ..., n)
with which to estimate ρ.
Karl Pearson (1909) introduced the estimator r* ('biserial r'), which we express in the
following form:†

$$r^* = \frac{\frac1n\sum(X_i-\bar X)(Z_i-\bar Z)}{\left\{\frac1n\sum(X_i-\bar X)^2\right\}^{1/2}\lambda(T)} = \frac{r\left\{\frac1n\sum(Z_i-\bar Z)^2\right\}^{1/2}}{\lambda(T)}, \qquad (3-2)$$

where r is the product-moment correlation coefficient of (X_i, Z_i), and T is the solution of
the equation‡

$$\int_T^{+\infty}\lambda(y)\,dy = \bar Z. \qquad (3-3)$$


r* will be discussed completely in § 5. For the present we shall merely state the asymptotic
variance obtained by Soper (1913):

$$AV(r^*) = \frac1n\left\{\rho^4 + \rho^2\left[\frac{pq\,\omega^2}{\lambda^2} + \frac{(2p-1)\,\omega}{\lambda} - \frac52\right] + \frac{pq}{\lambda^2}\right\}, \qquad (3-4)$$

where the functions p, q and λ all have argument ω. √{AV(r*)} is given in Table 2 at the end
of the paper. In view of symmetry about the values ρ = 0 and p(ω) = ½, the tabulation is
given for ρ = 0 to 1 in steps of 0.10, and for p = 0.05 to 0.50 in steps of 0.05.
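
A minimal sketch of how r* and Soper's asymptotic variance might be computed is given below. It assumes the forms r* = r{n⁻¹Σ(Z_i − Z̄)²}^{1/2}/λ(T) with p(T) = Z̄, and nAV(r*) = ρ⁴ + ρ²[pqω²/λ² + (2p − 1)ω/λ − 5/2] + pq/λ²; the function names `biserial_r` and `soper_av` are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def biserial_r(x, z):
    """Sample biserial correlation r* for values x and 0/1 indicators z.

    Requires 0 < mean(z) < 1, otherwise T is not defined.
    """
    n = len(x)
    xbar = sum(x) / n
    zbar = sum(z) / n
    sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z)) / n
    sxx = sum((xi - xbar) ** 2 for xi in x) / n
    t = _STD.inv_cdf(1.0 - zbar)          # solves p(T) = zbar
    return sxz / (math.sqrt(sxx) * _STD.pdf(t))


def soper_av(omega, rho, n):
    """Asymptotic variance of r* at dichotomy point omega and correlation rho."""
    lam = _STD.pdf(omega)
    p = 1.0 - _STD.cdf(omega)
    q = 1.0 - p
    bracket = p * q * omega ** 2 / lam ** 2 + (2.0 * p - 1.0) * omega / lam - 2.5
    return (rho ** 4 + rho ** 2 * bracket + p * q / lam ** 2) / n


# Tabled quantity {n AV(r*)}^(1/2) for rho = 0.1 and p = 0.5 (omega = 0)
print(round(math.sqrt(soper_av(0.0, 0.1, 1)), 3))
```

Setting n = 1 in `soper_av` gives the quantity nAV(r*), whose square root is the value tabulated against ρ and p.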
Since the random variable Z takes the value 0 or 1, the joint density of (X, Z) can be
written

$$f(x,z) = z f(x,1) + (1-z) f(x,0), \qquad (3-5)$$

where

$$f(x,0) = \int_{-\infty}^{\omega}\Psi(x,y)\,dy, \qquad f(x,1) = \int_\omega^{+\infty}\Psi(x,y)\,dy, \qquad (3-6)$$

with Ψ(x, y) denoting the bivariate normal density, ψ{(x−μ)/σ, y}, with means μ and 0,
variances σ² and 1, and correlation ρ. §§ 4 and 6 are devoted to a discussion of the likelihood
function Πf(x_i, z_i).


4. PROPERTIES OF THE MAXIMUM-LIKELIHOOD ESTIMATORS 


As the likelihood function stands it may be expressed as

$$L(\mu,\sigma^2,\omega,\rho) = \prod\left\{z_i\,\xi\!\left(\frac{x_i-\mu}{\sigma},\omega\right) + (1-z_i)\,\eta\!\left(\frac{x_i-\mu}{\sigma},\omega\right)\right\}. \qquad (4-1)$$


Maritz (1953) considered the restricted model with μ = 0, σ² = 1. Using biserial data
(X_i, Z_i) (i = 1, 2, ..., n), he introduces a grouping of the X observations, and then considers
† All Σ and Π symbols with index i will have limits 1 to n.


‡ The reason for this definition of T will be apparent in § 5, when we show that r* is consistent and
asymptotically normal.
the observations to be concentrated at their respective cell mid-points. This leads to a neat
solution of the problem by probit-analysis methods. A proof of the convergence of this
method as the grouping becomes finer must depend on a close examination of the limiting
processes involved. Specifically, it is necessary that as the cell width becomes small, and
the sample size large, each cell must contain sufficiently many observations that the ratio
of the number of X observations whose corresponding Z observations are 1 to the number
of X observations provides a valid approximation to the conditional probability that Z = 1.
Instead of attempting a discussion of this point, we shall derive the asymptotic variances
for the four-parameter problem (and, as a by-product, for the two-parameter problem),
and in § 6 discuss an iterative method for obtaining ω̂ and ρ̂. This method, while more
time-consuming than that of Maritz, does not require grouping. It should be noted in this
connexion that Tocher's exact method (see Tocher, 1949, pp. 9-11), also known as the 'scoring'
method, does not help in this case, owing to the difficulty of obtaining expected second
partial derivatives of L.

The likelihood situation of Maritz differs from ours because of the fact that when μ and
σ² are set equal to 0 and 1 respectively, the customary method for obtaining asymptotic
variances of ω̂ and ρ̂ by an inversion of the information matrix leads to smaller variances.
As far as the solution for ω̂ and ρ̂ is concerned, the results will remain the same after a slight
transformation.†

It may be remarked without dwelling at any length on regularity conditions that those
given by Cramér (1946, p. 500) may be easily verified, since f(x, 0) and f(x, 1) are both
integrals of bivariate normal densities. Consequently, μ̂, σ̂², ω̂ and ρ̂ will be asymptotically
normal and asymptotically efficient estimators of the corresponding parameters. We now
use the information matrix technique to find the asymptotic variances of ω̂ and ρ̂.


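THEOREM I. The asymptotic variances of ω̂ and ρ̂ are

$$AV(\hat\omega) = \frac1n\left\{\frac{(1-\rho^2)\int_{-\infty}^{+\infty}(x-\rho\omega)^2g\,dx}{\int_{-\infty}^{+\infty}g\,dx\int_{-\infty}^{+\infty}x^2g\,dx - \left(\int_{-\infty}^{+\infty}xg\,dx\right)^2} + \tfrac12\rho^2(\rho^2\omega^2+2)\right\},$$

$$AV(\hat\rho) = \frac{(1-\rho^2)^2}{n}\left\{\frac{(1-\rho^2)\int_{-\infty}^{+\infty}g\,dx}{\int_{-\infty}^{+\infty}g\,dx\int_{-\infty}^{+\infty}x^2g\,dx - \left(\int_{-\infty}^{+\infty}xg\,dx\right)^2} + \tfrac12\rho^2\right\},$$

where

$$g(x,\omega,\rho) = \lambda(x)\,\phi\!\left(\frac{\omega-\rho x}{\sqrt{(1-\rho^2)}}\right)\phi\!\left(-\frac{\omega-\rho x}{\sqrt{(1-\rho^2)}}\right).$$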


Proof. Using expression (4-1) for L, and letting δ² refer to any of the ten second-order
partial operators, we obtain the fundamental relation

$$E(\delta^2\log L) = nq\,E_0(\delta^2\log\eta) + np\,E_1(\delta^2\log\xi), \qquad (4-2)$$

where E_0 and E_1 mean conditional expectation with respect to the conditional densities of
X given Y < ω and Y ≥ ω respectively.


† The author expresses his indebtedness to the referee for this fact, and for its proof which will be
given later.





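For each of the possible operators δ² the calculation of (4-2) proceeds in about the same
way. We compute, as an illustration, E(∂² log L/∂ω²).
It may be shown that

$$\psi(x \mid Y<\omega) = \frac{\eta(x,\omega)}{q(\omega)}, \qquad \psi(x \mid Y\ge\omega) = \frac{\xi(x,\omega)}{p(\omega)}.$$

After some differentiation we get

$$E_0\left(\frac{\partial^2\log\eta}{\partial\omega^2}\right) = \frac1q\int_{-\infty}^{+\infty}\left\{\frac{\partial\psi(x,\omega)}{\partial\omega} - \frac{\psi^2(x,\omega)}{\eta(x,\omega)}\right\}dx,$$
$$E_1\left(\frac{\partial^2\log\xi}{\partial\omega^2}\right) = -\frac1p\int_{-\infty}^{+\infty}\left\{\frac{\partial\psi(x,\omega)}{\partial\omega} + \frac{\psi^2(x,\omega)}{\xi(x,\omega)}\right\}dx. \qquad (4-3)$$

Combining the terms of (4-3) according to (4-2), making use of the relation

$$\int_{-\infty}^{+\infty}\psi^2(x,\omega)\left\{\frac{1}{\xi(x,\omega)} + \frac{1}{\eta(x,\omega)}\right\}dx = \frac{1}{1-\rho^2}\int_{-\infty}^{+\infty}g(x,\omega,\rho)\,dx,$$

and performing similar computations for the expectations of the other second partials of
log L, we arrive at the information matrix†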



































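$$n\begin{bmatrix}
\dfrac{a_0}{1-\rho^2} & -\dfrac{a_1-\rho\omega a_0}{(1-\rho^2)^2} & \dfrac{\rho a_0}{\sigma(1-\rho^2)} & \dfrac{\rho a_1}{\sigma(1-\rho^2)} \\[2ex]
 & \dfrac{a_2-2\rho\omega a_1+\rho^2\omega^2a_0}{(1-\rho^2)^3} & -\dfrac{a_1\rho-\rho^2\omega a_0}{\sigma(1-\rho^2)^2} & -\dfrac{a_2\rho-\rho^2\omega a_1}{\sigma(1-\rho^2)^2} \\[2ex]
\text{symmetric} & & \dfrac{1-\rho^2+\rho^2a_0}{\sigma^2(1-\rho^2)} & \dfrac{\rho^2a_1}{\sigma^2(1-\rho^2)} \\[2ex]
 & & & \dfrac{2(1-\rho^2)+\rho^2a_2}{\sigma^2(1-\rho^2)}
\end{bmatrix},$$

$$\text{where}\quad a_k = \int_{-\infty}^{+\infty}x^kg(x,\omega,\rho)\,dx. \qquad (4-4)$$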


The asymptotic variances of ω̂ and ρ̂, obtained after inversion, correspond with the
expressions of the theorem.


The two-parameter problem of Maritz, solved by considering the upper left 2 × 2 submatrix,
yields asymptotic variances which are the leading terms of the expressions given
in the theorem; they coincide with his results.


The role played by ω in AV(ρ̂) is partially described by the following theorem:

THEOREM II. AV(ρ̂) is a minimum for each ρ when ω = 0.

Proof. For the case ρ = 0 the proof follows from Theorem IV of the next section. For the
case ρ ≠ 0, make the transformation (ω − ρx)(1−ρ²)^{-1/2} = y in the integrand of (4-4). AV(ρ̂)
can then be expressed as

$$AV(\hat\rho) = \frac{\rho^2(1-\rho^2)^2}{n}\left\{\frac12 + \frac{b_0}{b_0b_2-b_1^2}\right\},$$

where b_k = E′[X^k φ(X) φ(−X)], with E′ denoting expectation with respect to the distribution
of N[ω(1−ρ²)^{-1/2}, ρ²(1−ρ²)^{-1}]. It can now be seen from considerations of symmetry





† Columns (from left to right) and rows (from top to bottom) correspond to ω, ρ, μ, σ.


that b_0 and b_2 have maxima at ω = 0, while b_1² has its minimum at ω = 0, which completes
the proof.


5. THE UTILITY OF r*


We now present a series of results concerning r*, which will be followed by a general
discussion of its value. Note that expression (3-2) for r* is invariant under a linear
transformation of the X_i, so in all results pertaining only to r*, we set μ = 0 and σ² = 1.

A little later we shall need E(XZ). Since it is not difficult to obtain, we will give the
expression for the general moment α_k = E(X^k Z).

THEOREM III.

$$\alpha_k = \sum_{j=0}^{k}\binom{k}{j}\rho^{k-j}(1-\rho^2)^{j/2}\,a_j\int_\omega^{+\infty}y^{k-j}\lambda(y)\,dy,$$

where a_j is the j-th moment of the random variable N(0, 1).

Proof. Using the definition of E_1, we obtain

$$E(X^kZ) = pE_1(X^k) = \int_{-\infty}^{+\infty}\int_\omega^{+\infty}x^k\psi(x,y)\,dy\,dx.$$

Make the transformation t = (x − ρy)/√(1−ρ²). The above then reduces to

$$\frac{1}{2\pi}\int_{-\infty}^{+\infty}\int_\omega^{+\infty}\{t\sqrt{(1-\rho^2)}+\rho y\}^k\,e^{-\frac12(t^2+y^2)}\,dy\,dt.$$

Using a binomial expansion and integrating with respect to t, we obtain Theorem III. The
integrals which occur are incomplete gamma functions which may be evaluated by the
usual recursion relation.

For completeness we include the relation between ρ(X, Y) and ρ(X, Z), known to Karl
Pearson:

$$\rho(X,Z) = \rho(X,Y)\,\frac{\lambda(\omega)}{\sqrt{(pq)}}.$$

It follows from the original definition of biserial correlation, as given by Pearson, that r*
is consistent. This fact is also an immediate consequence of relation (3-2) between r* and r:
r → ρ(X, Z) in probability as n → ∞. Thus

$$r^* = \frac{r\left\{\frac1n\sum(Z_i-\bar Z)^2\right\}^{1/2}}{\lambda(T)} \to \rho(X,Z)\,\frac{\sqrt{(pq)}}{\lambda(\omega)} \quad\text{in probability,}$$

and hence by the above, r* → ρ(X, Y) in probability.

With respect to the magnitude of r*, it is well known that |r*| can be greater than 1.
Something of the nature of this phenomenon can be understood by looking at r. In order
to prove a result concerning the magnitude of r*, we shall need a preliminary result (see
Tate, 1953, Lemma 2):


THEOREM IV. p(x) q(x) ≥ ½π λ²(x), (−∞ ≤ x ≤ +∞), with equality at 0, ±∞.

Now we have


THEOREM V. |r*/r| ≥ √(½π).


Proof. Rewriting (3-2) as

$$r^* = \frac{r\left\{\frac1n\sum(Z_i-\bar Z)^2\right\}^{1/2}}{\lambda(T)},$$

we have, in view of the definition of T,

$$r^* = \frac{r\sqrt{\{p(T)\,q(T)\}}}{\lambda(T)}.$$

Theorem IV applies for any T, so Theorem V is proved. As a consequence of Theorem V,
we see that

r* > +1 or r* < −1 according as r > +√(2/π) or r < −√(2/π).
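
The factor that converts r into r* can be examined numerically. The sketch below assumes the relation r* = r√{p(T)q(T)}/λ(T) with p(T) = Z̄ and evaluates the multiplier over a range of Z̄; the function name `r_star_multiplier` is hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def r_star_multiplier(zbar):
    """Return sqrt(p(T) q(T)) / lambda(T), where T solves p(T) = zbar."""
    t = _STD.inv_cdf(1.0 - zbar)
    return math.sqrt(zbar * (1.0 - zbar)) / _STD.pdf(t)


bound = math.sqrt(math.pi / 2.0)           # lower bound from Theorem IV
for zbar in (0.05, 0.25, 0.50, 0.75, 0.95):
    print(zbar, round(r_star_multiplier(zbar), 4), r_star_multiplier(zbar) >= bound - 1e-12)
```

The multiplier is smallest at Z̄ = ½ and grows as the dichotomy becomes more extreme, which is why |r*| exceeds 1 more readily for unbalanced splits.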


Asymptotic normality of r*, which will be needed later in this section, is a consequence
of a theorem of Cramér.


THEOREM VI. r* ~ N{ρ, AV(r*)}.


Proof. In expression (3-2) the term λ(T) is seen to be an infinitely differentiable function
of Z̄. Thus, r* is a totally differentiable function of the sample means of X, Z, X² and XZ.
Applying Cramér's theorem (see Cramér, 1946, p. 366), we have asymptotic normality with the
asymptotic variance (3-4) calculated by Soper.

We shall now present two results which are more important than those just preceding.
They concern the asymptotic, or large-sample, efficiency of r*, with respect to the class of
estimators of ρ based on the sample (X_i, Z_i).


THEOREM VII. r* is an asymptotically most efficient estimator of ρ when ρ = 0.


Proof. In view of Theorem VI on asymptotic normality, we have a right to inquire about
the asymptotic efficiency of r*, which will be denoted by

$$AEff(r^*) = \frac{AV(\hat\rho)}{AV(r^*)}.$$

It may be seen from Theorem I that

$$AV(\hat\rho \mid \omega, 0) = \frac{p(\omega)\,q(\omega)}{n\{\lambda(\omega)\}^2}. \qquad (5-1)$$

Now, from (3-4) we observe that (5-1) coincides with AV(r* | ω, 0). The conclusion follows
from the definition of an asymptotically most efficient estimator.


THEOREM VIII. r* is an asymptotically least efficient estimator of ρ when |ρ| → 1.
Proof. An application of Theorem IV shows that

$$\phi\!\left(\frac{\omega-\rho x}{\sqrt{(1-\rho^2)}}\right)\phi\!\left(-\frac{\omega-\rho x}{\sqrt{(1-\rho^2)}}\right) \le \frac{2}{\pi}.$$

Hence, recalling the definition of g(x, ω, ρ) in Theorem I, we see that all integrals of the form
∫ x^k g(x, ω, ρ) dx, taken from −∞ to +∞, exist. Schwarz's inequality shows that AV(ρ̂ | ω, ρ) is
such that the term in braces is non-vanishing. Thus, AV(ρ̂ | ω, ρ) → 0 as |ρ| → 1. From the fact
that AV(r* | ω, ρ) tends to a non-zero limit as |ρ| → 1 (see (3-4)), we conclude that
AEff(r* | ω, ρ) → 0.
The special case ω = 0 has interesting features which will appear in Theorems X and XI.
First we shall need another preliminary result (see Tate, 1953, Lemma 1):


THEOREM IX. {1 − 2p(x)}λ(x) − x p(x) q(x) ≥ 0, (x ≥ 0).
THEOREM X. The asymptotic variance of r* has its minimum for each ρ at ω = 0.
Proof. In view of symmetry, it will be sufficient to show the result for ω ≥ 0. Let

$$A(\omega) = \frac{\omega^2p(\omega)q(\omega)}{\{\lambda(\omega)\}^2} - \frac{\omega\{1-2p(\omega)\}}{\lambda(\omega)}, \qquad B(\omega) = \frac{p(\omega)q(\omega)}{\{\lambda(\omega)\}^2},$$

$$g(\omega) = \{1-2p(\omega)\}\lambda(\omega) - \omega p(\omega)q(\omega), \qquad h(\omega) = p(\omega)q(\omega) - \pi\{\lambda(\omega)\}^2/2.$$

From this point until the end of the proof, we shall omit ω whenever it appears as an
argument of any function. We have (Tate, 1953)

$$g' = 2\lambda^2 - pq, \quad g'' = -4\omega\lambda^2 + (2q-1)\lambda, \quad g(0) = g(\infty) = 0, \quad g'(0) > 0,$$
$$h' = \lambda(1-2q) + \pi\omega\lambda^2, \quad h'' = \lambda^2(\pi - 2 - 2\pi\omega^2) - \omega(1-2q)\lambda,$$
$$h(0) = h(\infty) = h'(0) = 0, \quad h''(0) > 0.$$

Accordingly, we have

$$A = -\omega g\lambda^{-2}, \qquad B = h\lambda^{-2} + \tfrac12\pi, \qquad\text{with } A \le 0,\ B \ge \tfrac12\pi,$$

both equalities holding at ω = 0. The relation AV(r* | ω, ρ) ≥ AV(r* | 0, ρ) for all ρ may be
written ρ²A + B ≥ ½π for all ρ. Since A ≤ 0 this last expression is implied by A + B ≥ ½π,
which in turn is equivalent to h ≥ ωg. Thus, we must show k = h − ωg ≥ 0:

$$k' = 2\omega q(1-q) - 2(2q-1)\lambda + (\pi-2)\omega\lambda^2,$$
$$k'' = 2q(1-q) - \lambda^2\{6 - \pi + \omega^2(2\pi-4)\}, \qquad (5-2)$$
$$k(0) = k(\infty) = k'(0) = 0, \quad k''(0) = 1 - 3/\pi > 0.$$

We shall show that there exists no y such that k′(y) = 0, k(y) < 0. Suppose such a y does
exist. Then

$$2(2q-1)\lambda = 2yq(1-q) + (\pi-2)y\lambda^2,$$
$$q(1-q)(1+y^2) - \tfrac12\pi\lambda^2 - y(2q-1)\lambda < 0. \qquad (5-3)$$

Substituting the right member of the first expression into the second, we have

$$2q(1-q) < \lambda^2\{\pi + (\pi-2)y^2\}.$$

Thus, k″(y) < λ²{2π − 6 − (π−2)y²}. A negative maximum must, from (5-2), be followed by
a negative minimum. Hence, from the above relation in k″(y), there exist no extrema which
exceed {(2π−6)/(π−2)}^{1/2}. Assuming there is a negative extremum of k, then there must be
a negative minimum in (0, 1). Let y be this minimum point. Then k″(y) ≥ 0, or from (5-2),
2q(1−q) − λ²{6 − π + (2π−4)y²} ≥ 0. Substituting the value of 2q(1−q) obtained from the
first equation in (5-3), we reach (2q−1) − yλ{2 + (π−2)y²} ≥ 0. The left member vanishes
at y = 0 and has a negative derivative for 0 < y² < 1. Therefore, there is no negative minimum
in (0, 1), and from the previous argument k ≥ 0, which completes the proof.

Since for any fixed ρ, r* is a better estimator when ω = 0, it will be useful to have for this
case something simpler in the way of an asymptotic distribution of r* than that contained
in Theorem VI. We are therefore led to


THEOREM XI. When ω = 0, we have to a close approximation

$$\frac{2}{\sqrt5}\tanh^{-1}\frac{2r^*}{\sqrt5} \sim N\left(\frac{2}{\sqrt5}\tanh^{-1}\frac{2\rho}{\sqrt5},\ \frac1n\right).$$

Proof.

$$AV(r^* \mid 0,\rho) = \frac1n\left(\rho^4 - \tfrac52\rho^2 + \tfrac12\pi\right) = \frac1n\left\{\left(\tfrac54 - \rho^2\right)^2 - \left(\tfrac{25}{16} - \tfrac12\pi\right)\right\}.$$

Dropping the last term and solving the equation

$$g'(x) = \frac{1}{\tfrac54 - x^2},$$

we have g(x) = (2/√5) tanh⁻¹(2x/√5). It is known that

$$\sqrt{n}\,\{g(r^*) - g(\rho)\} \sim N(0,1),$$

so the theorem is proved.
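
A minimal sketch of how the transformation g(x) = (2/√5) tanh⁻¹(2x/√5) might be used to form an approximate interval for ρ when ω = 0 is given below. The function names and the default normal deviate `k = 1.96` are assumptions made for illustration, and the transformation is defined only for |r*| < √5/2.

```python
import math

SQRT5 = math.sqrt(5.0)


def g(x):
    """Variance-stabilizing transformation for r* when omega = 0."""
    return (2.0 / SQRT5) * math.atanh(2.0 * x / SQRT5)


def g_inverse(v):
    """Inverse of g."""
    return (SQRT5 / 2.0) * math.tanh(SQRT5 * v / 2.0)


def rho_interval(r_star, n, k=1.96):
    """Approximate interval for rho from sqrt(n){g(r*) - g(rho)} ~ N(0, 1)."""
    centre = g(r_star)
    half = k / math.sqrt(n)
    return g_inverse(centre - half), g_inverse(centre + half)


low, high = rho_interval(0.41, 20)
print(round(low, 3), round(high, 3))
```

Because the interval is built on the transformed scale and mapped back, its end-points cannot exceed √5/2 in absolute value, although they may still exceed 1.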


Discussion of results concerning r* 


In looking over Theorems V, VI, VII, VIII, X and XI, several facts stand out. First, even
though r* is consistent and asymptotically normal, it is still inadequate for estimating ρ
because of its possible magnitude and its lack of large-sample efficiency for large values of
|ρ|. In the case of testing the hypothesis H: ρ = ρ₀ the first defect is not of so much
consequence. Even in a problem of estimation, one can always operate under the rule: when
|r*| < 1, estimate ρ by r*; when r* ≥ 1, estimate ρ = 1; and when r* ≤ −1, estimate ρ = −1.
The gross defect is lack of efficiency. In practically all applications it is of more interest to
detect large values of ρ than small values. In just such cases r* is a 'worst' estimator. On
the other hand, again speaking in large-sample terms, when ρ = 0, r* is a 'best' estimator.†
Hence, if we base a test of H: ρ = ρ₀ on r*, good results should be achieved when |ρ₀| is
small. It is then recommended that r* be used for one and only one purpose, to test H: ρ = ρ₀
when |ρ₀| is small. If, in addition, the assumption ω = 0 is tenable, then the variance
stabilizing transformation of Theorem XI may be used, calculations being performed with
Table V B of Fisher (1946, p. 210). In such a case certain advantages (see Fisher, 1946,
pp. 197-204) will accrue. {nAV(r*)}^{1/2} is given in Table 2. Note that from Theorem X,
ω = 0 is desirable on the grounds of precision.


6. SOLUTION OF THE LIKELIHOOD EQUATIONS 


Using the notation of (4-1), we have the likelihood equations 








= 0. (6-1) 








As was mentioned previously (cf. footnote, p. 208), the solution of the four-parameter
problem reduces to the problem of determining ω̂ and ρ̂. It turns out that the likelihood
equations for μ and ω may be combined to yield μ̂ = x̄. Similarly, a combination of likelihood
equations for σ² and ρ will give us σ̂² = s². Details will be omitted.

We now replace μ and σ by x̄ and s in the expression (x_i − μ)/σ, which occurs in (6-1),
and denote the result by x′_i. Also, let L′ denote the new likelihood function. The solution of
δL′ = 0 for ω̂ and ρ̂ follows.


† Another point in favour of r* is indicated by the parallelism of Theorems II and X.
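One may easily verify that

$$\frac{\partial\xi(x'_i,\omega)}{\partial\omega} = -\psi(x'_i,\omega), \qquad \frac{\partial\eta(x'_i,\omega)}{\partial\omega} = \psi(x'_i,\omega),$$
$$\frac{\partial\xi(x'_i,\omega)}{\partial\rho} = \frac{(x'_i-\rho\omega)\,\psi(x'_i,\omega)}{1-\rho^2}, \qquad \frac{\partial\eta(x'_i,\omega)}{\partial\rho} = \frac{-(x'_i-\rho\omega)\,\psi(x'_i,\omega)}{1-\rho^2}. \qquad (6-2)$$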


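By the use of (6-2) the equations δL′ = 0 can be written as

$$\sum(x'_i-\rho\omega)(2z_i-1)\,\phi\!\left[(2z_i-1)\frac{\omega-\rho x'_i}{\sqrt{(1-\rho^2)}}\right] = 0,$$
$$\sum(2z_i-1)\,\phi\!\left[(2z_i-1)\frac{\omega-\rho x'_i}{\sqrt{(1-\rho^2)}}\right] = 0. \qquad (6-3)$$

Now introduce the notation

$$\delta_i = 2z_i - 1, \quad \gamma_i = (\omega-\rho x'_i)(1-\rho^2)^{-1/2}, \quad \phi_i = \phi(\delta_i\gamma_i), \quad \Delta_i = \phi_i(\phi_i - \delta_i\gamma_i). \qquad (6-4)$$

Rewriting (6-3) again, in the new notation, we have

$$\sum\delta_i\phi_i = 0, \qquad \sum\delta_ix'_i\phi_i = 0. \qquad (6-5)$$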


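Easy differentiation gives φ′(x) = φ(x){φ(x) − x}. Newton's method in two variables gives
the following equations in Δω and Δρ, where Δω = ω − ω₁, Δρ = ρ − ρ₁, ω₁ and ρ₁ being
initial guesses:

$$\frac{\sum\Delta_i}{(1-\rho_1^2)^{1/2}}\,\Delta\omega + \frac{\rho_1\omega_1\sum\Delta_i - \sum\Delta_ix'_i}{(1-\rho_1^2)^{3/2}}\,\Delta\rho = -\sum\delta_i\phi_i,$$
$$\frac{\sum\Delta_ix'_i}{(1-\rho_1^2)^{1/2}}\,\Delta\omega + \frac{\rho_1\omega_1\sum\Delta_ix'_i - \sum\Delta_ix'^2_i}{(1-\rho_1^2)^{3/2}}\,\Delta\rho = -\sum\delta_ix'_i\phi_i. \qquad (6-6)$$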


Let Δ be the determinant of the coefficients. The method of solution will then be the following:
(i) Compute ω*, r* from the sample (x_i, z_i) (i = 1, 2, ..., n), where r* is the sample biserial
correlation coefficient and ω* is the solution of the equation p(ω) = z̄. Now, let ω₁ = ω* and
ρ₁ = r* when |r*| < 1,
ρ₁ = +0.90 when r* ≥ 1,
ρ₁ = −0.90 when r* ≤ −1.

(ii) Compute δ_i, γ_i, φ_i, δ_iφ_i, δ_ix′_iφ_i, Δ_i, Δ_ix′_i, Δ_ix′_i² for i = 1, 2, ..., n, where δ_i, γ_i, φ_i, Δ_i
are defined in (6-4), and the numerical values of the φ_i = φ(δ_iγ_i) are obtained from the tables
printed on pp. 217-221 below. Note that these tables must be entered with X = δ_iγ_i, while
φ_i = Z/Q if X > 0 and Z/P if X < 0.

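(iii) Evaluate the three determinants

$$\Delta = \begin{vmatrix}\sum\Delta_i & \rho_1\omega_1\sum\Delta_i - \sum\Delta_ix'_i \\ \sum\Delta_ix'_i & \rho_1\omega_1\sum\Delta_ix'_i - \sum\Delta_ix'^2_i\end{vmatrix}\frac{1}{(1-\rho_1^2)^2},$$

$$\Delta\omega = \begin{vmatrix}-\sum\delta_i\phi_i & \rho_1\omega_1\sum\Delta_i - \sum\Delta_ix'_i \\ -\sum\delta_ix'_i\phi_i & \rho_1\omega_1\sum\Delta_ix'_i - \sum\Delta_ix'^2_i\end{vmatrix}\frac{1}{\Delta(1-\rho_1^2)^{3/2}},$$

$$\Delta\rho = \begin{vmatrix}\sum\Delta_i & -\sum\delta_i\phi_i \\ \sum\Delta_ix'_i & -\sum\delta_ix'_i\phi_i\end{vmatrix}\frac{1}{\Delta(1-\rho_1^2)^{1/2}}.$$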
(iv) Obtain (ω₂, ρ₂) from (Δω, Δρ) and (ω₁, ρ₁), and repeat the process using ω = ω₂, ρ = ρ₂
in place of ω₁ and ρ₁.

The rule given in (i) is somewhat arbitrary, but is believed to be a good rule of thumb.
The longest stage in the scheme outlined above is the determination of φ_i, i = 1, 2, ..., n,
from the tables.
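
The iteration can be sketched in a few lines when φ is computed directly instead of being read from tables. The code below is an illustration under stated assumptions, not the author's computing scheme: it takes the x values as already standardized, solves Σδ_iφ_i = 0 and Σδ_ix_iφ_i = 0 by Newton's method with coefficients obtained from φ′(x) = φ(x){φ(x) − x}, and uses hypothetical names (`phi`, `newton_biserial`). The starting values must be supplied by the caller, and no safeguard is included against a step that takes ρ outside (−1, 1).

```python
import math


def phi(t):
    """Reciprocal of Mills's ratio: normal density over upper-tail area."""
    dens = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(t / math.sqrt(2.0))
    return dens / tail


def newton_biserial(x, z, omega, rho, tol=1e-10, max_iter=50):
    """Newton iteration for (omega, rho) from standardized x and 0/1 values z."""
    for _ in range(max_iter):
        c = 1.0 - rho * rho
        s0 = s1 = s2 = f1 = f2 = 0.0
        for xi, zi in zip(x, z):
            d = 2 * zi - 1                              # delta_i
            g = (omega - rho * xi) / math.sqrt(c)       # gamma_i
            ph = phi(d * g)                             # phi_i
            big = ph * (ph - d * g)                     # Delta_i
            f1 += d * ph
            f2 += d * xi * ph
            s0 += big
            s1 += big * xi
            s2 += big * xi * xi
        # Newton equations, each multiplied through by (1 - rho^2)^(3/2)
        a11, a12 = c * s0, rho * omega * s0 - s1
        a21, a22 = c * s1, rho * omega * s1 - s2
        b1, b2 = -(c ** 1.5) * f1, -(c ** 1.5) * f2
        det = a11 * a22 - a12 * a21
        d_omega = (b1 * a22 - a12 * b2) / det
        d_rho = (a11 * b2 - a21 * b1) / det
        omega += d_omega
        rho += d_rho
        if abs(d_omega) + abs(d_rho) < tol:
            break
    return omega, rho
```

Multiplying both Newton equations by (1 − ρ²)^{3/2} leaves the solution unchanged and avoids dividing by small powers of 1 − ρ² inside the loop.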

We shall now present an illustration of the method. In order to have a good vantage point
for observing the way the calculations run, we select a random sample from a bivariate
normal population with zero means, unit variances and correlation 1/√2. A table of random
numbers from such a population is not available directly†, but can be constructed from a
table of random numbers from N(0, 1) as follows: Let U = N(0, 1), V = N(0, 1), ω = ½,
with U and V independent. Now let X = U and Y = (U + V)/√2. After introducing the
dichotomy

Z = 1 if Y ≥ ½, Z = 0 if Y < ½, (6-7)

we have the desired set-up. It is important to note that for this particular example, the
x′_i of the computing scheme become merely x_i, since it is known that μ = 0 and σ² = 1. For
computations see Table 1 below.
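
The construction just described is easy to reproduce with a pseudo-random generator in place of a printed table of random normal deviates. The sketch below is illustrative only: the function name `simulate_biserial`, the seed, and the default dichotomy point 0.5 are assumptions of this example.

```python
import math
import random


def simulate_biserial(n, omega=0.5, seed=0):
    """Draw n pairs (x, z) with X = U, Y = (U + V)/sqrt(2), Z = 1 if Y >= omega."""
    rng = random.Random(seed)
    xs, zs = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        y = (u + v) / math.sqrt(2.0)    # corr(X, Y) = 1/sqrt(2)
        xs.append(u)
        zs.append(1 if y >= omega else 0)
    return xs, zs


xs, zs = simulate_biserial(20)
print(sum(zs), "of", len(zs), "observations fall above the dichotomy point")
```

With a large n the proportion of ones should settle near p(ω), and the mean of x·z near ρλ(ω), which gives a quick check on the construction.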

A second iteration resulted in ω₃ = 0.251, ρ₃ = 0.489. Since ρ̂ remained unchanged in
the third decimal, the results were not included. Recall that the true value of ρ is 0.707.
On the basis of our sample of 20, ρ̂ = 0.489 is the best we can do. However, by using the
iterative scheme instead of r* we removed 27% of the error.

Although r* is used merely to start the iteration, the fact that

$$|r^* - E(r^*)| / \sqrt{\{AV(r^*)\}} = 1.64$$

serves to indicate that we could have been more fortunate in our selection of a sample.


REFERENCES 


Cramér, H. (1946). Mathematical Methods of Statistics. Princeton: Princeton University Press.

Du Bois, P. H. (1942). A note on the computation of biserial r in item validation. Psychometrika, 7, 143.

Dunlap, J. W. (1936). A nomograph for computing biserial r. Psychometrika, 1, 59.

Fisher, R. A. (1946). Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd.

Maritz, J. S. (1953). Estimation of the correlation coefficient of a bivariate normal population where
one of the variables is dichotomized. Psychometrika, 18, 97.

Pearson, K. (1909). On a new method for determining the correlation between a measured character
A and a character B. Biometrika, 7, 96.

Royer, E. B. (1941). Punched card methods for determining biserial coefficients. Psychometrika, 6, 55.

Soper, H. E. (1913). On the probable error for the biserial expression for the correlation coefficient.
Biometrika, 10, 384.

Tate, R. F. (1953). On a double inequality of the normal distribution. Ann. Math. Statist. 24, 133.

Tocher, K. D. (1949). A note on the analysis of grouped probit data. Biometrika, 36, 9.


† [A table of Correlated Random Normal Deviates is now at press and will be issued shortly from the
Department of Statistics, University College, London as Tracts for Computers No. xxvi. Ed.]








Table 1. Computation of example

   x_i    z_i     γ_i     φ_i    x_iφ_i   φ_i−δ_iγ_i   Δ_ix_i   Δ_ix_i²
  0.24     0     0.030   0.779    0.187     0.809       0.151    0.036
  0.63     1    -0.146   0.707    0.445     0.853       0.380    0.239
 -0.59     0     0.404   0.560   -0.330     0.964      -0.319    0.188
  1.08     0    -0.348   1.032    1.115     0.684       0.762    0.822
  0.06     0     0.111   0.729    0.044     0.840       0.027    0.002
 -0.01     0     0.143   0.709   -0.007     0.852      -0.006    0.000
  1.59     1    -0.578   0.470    0.747     1.048       0.784    1.247
 -0.41     0     0.323   0.604   -0.248     0.927      -0.230    0.094
 -0.33     1     0.287   0.989   -0.326     0.702      -0.229    0.076
  0.99     1    -0.308   0.613    0.607     0.921       0.559    0.553
  0.30     0     0.003   0.796    0.239     0.799       0.191    0.057
 -2.07     0     1.070   0.262   -0.542     1.332      -0.722    1.495
 -0.21     0     0.233   0.655   -0.138     0.888      -0.122    0.026
 -0.47     1     0.350   1.033   -0.486     0.683      -0.332    0.156
  1.28     1    -0.438   0.541    0.692     0.979       0.678    0.868
  0.82     1    -0.231   0.656    0.538     0.887       0.477    0.391
  1.06     1    -0.339   0.595    0.631     0.934       0.589    0.624
  1.09     0    -0.353   1.035    1.128     0.682       0.770    0.829
  0.57     0    -0.119   0.875    0.499     0.756       0.377    0.215
 -0.53     1     0.377   1.052   -0.558     0.675      -0.376    0.199

Σx_i = 5.09          Σx_iz_i = 5.04       r* = +0.410 (ρ₁)      γ_i = 0.138 − 0.450x_i
Σx_i² = 15.626       Σz_i = 9             ω* = +0.126 (ω₁)
Σδ_iφ_i = −1.380     ΣΔ_i = 11.865        ΣΔ_ix_i² = 8.127
Σδ_ix_iφ_i = 0.343   ΣΔ_ix_i = 3.409      Δω = 0.108
ω₂ = 0.234           ρ₂ = 0.489

Table 2. The asymptotic standard deviation of r* (biserial r) as a function of ρ and p (equation (3-4))
All values must be divided by √n, as the quantity tabled is {nAV(r*)}^{1/2}

                               p or 1 − p
   ρ      0.05    0.10    0.20    0.25    0.30    0.35    0.45    0.50
  0.00   4.466   2.922   2.041   1.857   1.737   1.658   1.580   1.571
  0.10   2.104   1.699   1.419   1.353   1.308   1.278   1.247   1.243
  0.20   2.077   1.668   1.389   1.323   1.279   1.248   1.217   1.213
  0.30   2.033   1.616   1.339   1.273   1.229   1.198   1.167   1.163
  0.40   1.971   1.543   1.269   1.203   1.159   1.128   1.097   1.093
  0.50   1.893   1.449   1.179   1.114   1.069   1.038   1.008   1.004
  0.60   1.799   1.333   1.069   1.004   0.960   0.930   0.898   0.894
  0.70   1.691   1.194   0.939   0.875   0.831   0.801   0.769   0.766
  0.80   1.569   1.031   0.789   0.727   0.683   0.653   0.620   0.616
  0.90   1.438   0.842   0.619   0.559   0.517   0.486   0.453   0.449
  1.00   1.302   0.616   0.429   0.374   0.335   0.304   0.270   0.266
THE NORMAL PROBABILITY FUNCTION: TABLES OF CERTAIN
AREA-ORDINATE RATIOS AND OF THEIR RECIPROCALS


EDITORIAL
Notation. Writing X as a standardized normal deviate, the following notation will be used:*

$$Z = Z(X) = \frac{1}{\sqrt{(2\pi)}}\,e^{-\frac12X^2},$$

$$P = P(X) = \int_{-\infty}^{X}\frac{1}{\sqrt{(2\pi)}}\,e^{-\frac12u^2}\,du, \qquad Q = 1 - P = \int_X^{\infty}\frac{1}{\sqrt{(2\pi)}}\,e^{-\frac12u^2}\,du.$$

The table then gives for X = 0-00(0-01) 3-00 the values of the four ratios P/Z, Q/Z, Z/P and Z/Q. For 
the last three ratios, the results are extended for argument X = 3-00(0-01) 4-00(0-05) 5-00. The ratios 
P/Z, Q/Z and Z/Q are tabled to five decimal places and those for Z/P to five significant figures. 

Derivation. The ratio P/Z was freshly computed in the Department of Statistics, University College, 
London, using the Tables of Normal Probability Functions, National Bureau of Standards (1953). 
The ratio Q/Z has been taken directly from Table III of Karl Pearson’s Tables for Statisticians and 
Biometricians, Part II (1931). The ratios Z/P and Z/Q were calculated by Prof. Z. W. Birnbaum using 
the National Bureau of Standards Tables, and were originally included in two tables forming part of 
the preceding paper by Dr R. F. Tate (1955). With Dr Tate’s and Prof. Birnbaum’s approval it was, 
however, decided to give separately the more comprehensive table which follows. This will form a 
useful companion to Table II, Tables for Statisticians and Biometricians, Part II, which gives the same 
four ratios for an argument of P instead of X. It is intended to reproduce both tables in the second 
volume of Biometrika Tables for Statisticians. 

Having regard to this wider use, Dr Tate's notation has been modified, so that his φ(x) (see p. 206
above) becomes Z/P for x = −X < 0 and Z/Q for x = X > 0.

Interpolation. For a large part of the table accuracy to the full number of decimal places given may
be obtained by linear interpolation. Where this fails, the Bessel formula

$$y_\theta = (1-\theta)\,y_0 + \theta\,y_1 - \tfrac14\theta(1-\theta)(\delta^2y_0 + \delta^2y_1)$$

is adequate, except for P/Z with X > 2.50, when full use of second differences in Everett's formula
becomes necessary.
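
For readers who wish to check or extend the tabled values, the four ratios can be computed directly. The following is a minimal sketch using only the Python standard library; the function name `area_ordinate_ratios` is hypothetical, and the complementary error function is used so that Q keeps its accuracy in the upper tail.

```python
import math


def area_ordinate_ratios(x):
    """Return (P/Z, Q/Z, Z/P, Z/Q) for a standardized normal deviate x."""
    z = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)   # ordinate Z
    q = 0.5 * math.erfc(x / math.sqrt(2.0))                 # upper-tail area Q
    p = 0.5 * math.erfc(-x / math.sqrt(2.0))                # lower-tail area P
    return p / z, q / z, z / p, z / q


for x in (0.00, 0.50, 4.00):
    print(x, ["%.5f" % v for v in area_ordinate_ratios(x)])
```

Direct computation of this kind removes the need for interpolation, although for very large X the upper-tail area eventually underflows and an asymptotic series would be required.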



The normal probability function: extension of main table

   X       Q/Z        Z/P         Z/Q
  4.00   0.23665   0.0³13383   4.22561
  4.05   0.23401   0.0³10944   4.27330
  4.10   0.23143   0.0⁴89264   4.32103
  4.15   0.22890   0.0⁴72627   4.36880
  4.20   0.22642   0.0⁴58944   4.41662
  4.25   0.22399   0.0⁴47719   4.46447
  4.30   0.22161   0.0⁴38536   4.51237
  4.35   0.21928   0.0⁴31042   4.56030
  4.40   0.21700   0.0⁴24943   4.60827
  4.45   0.21476   0.0⁴19992   4.65628
  4.50   0.21257   0.0⁴15984   4.70432
  4.55   0.21042   0.0⁴12747   4.75240
  4.60   0.20831   0.0⁴10141   4.80051
  4.65   0.20624   0.0⁵80472   4.84865
  4.70   0.20421   0.0⁵63698   4.89682
  4.75   0.20222   0.0⁵50295   4.94503
  4.80   0.20027   0.0⁵39613   4.99326
  4.85   0.19835   0.0⁵31122   5.04153
  4.90   0.19647   0.0⁵24390   5.08983
  4.95   0.19462   0.0⁵19066   5.13815
  5.00   0.19281   0.0⁵14867   5.18650

* This conforms to the notation used by Pearson and Hartley in Biometrika Tables for Statisticians,
1 (1954).

The normal probability function. Table of area/ordinate ratios and their reciprocals

  X   |   P/Z   |   Q/Z   |   Z/P   |   Z/Q   |  X   |   P/Z   |   Q/Z   |   Z/P   |   Z/Q
 0.00 | 1.25331 | 1.25331 | 0.79788 | 0.79788 | 0.50 | 1.96402 | 0.87636 | 0.50916 | 1.14108
 0.01 | 1.26338 | 1.24338 | 0.79153 | 0.80426 | 0.51 | 1.98399 | 0.87078 | 0.50404 | 1.14840
 0.02 | 1.27357 | 1.23356 | 0.78520 | 0.81066 | 0.52 | 2.00426 | 0.86525 | 0.49894 | 1.15574
 0.03 | 1.28389 | 1.22387 | 0.77888 | 0.81708 | 0.53 | 2.02483 | 0.85977 | 0.49387 | 1.16310
 0.04 | 1.29434 | 1.21430 | 0.77260 | 0.82352 | 0.54 | 2.04572 | 0.85436 | 0.48882 | 1.17047
 0.05 | 1.30492 | 1.20484 | 0.76633 | 0.82999 | 0.55 | 2.06693 | 0.84900 | 0.48381 | 1.17786
 0.06 | 1.31564 | 1.19550 | 0.76008 | 0.83647 | 0.56 | 2.08846 | 0.84370 | 0.47882 | 1.18526
 0.07 | 1.32650 | 1.18627 | 0.75386 | 0.84298 | 0.57 | 2.11032 | 0.83845 | 0.47386 | 1.19268
 0.08 | 1.33750 | 1.17716 | 0.74766 | 0.84950 | 0.58 | 2.13252 | 0.83326 | 0.46893 | 1.20011
 0.09 | 1.34864 | 1.16816 | 0.74149 | 0.85605 | 0.59 | 2.15506 | 0.82812 | 0.46402 | 1.20756
 0.10 | 1.35993 | 1.15926 | 0.73533 | 0.86262 | 0.60 | 2.17795 | 0.82303 | 0.45915 | 1.21503
 0.11 | 1.37136 | 1.15048 | 0.72920 | 0.86921 | 0.61 | 2.20120 | 0.81799 | 0.45430 | 1.22251
 0.12 | 1.38295 | 1.14179 | 0.72309 | 0.87582 | 0.62 | 2.22481 | 0.81301 | 0.44948 | 1.23000
 0.13 | 1.39468 | 1.13321 | 0.71701 | 0.88244 | 0.63 | 2.24879 | 0.80807 | 0.44468 | 1.23751
 0.14 | 1.40658 | 1.12474 | 0.71095 | 0.88910 | 0.64 | 2.27315 | 0.80319 | 0.43992 | 1.24504
 0.15 | 1.41862 | 1.11636 | 0.70491 | 0.89577 | 0.65 | 2.29789 | 0.79835 | 0.43518 | 1.25258
 0.16 | 1.43083 | 1.10809 | 0.69889 | 0.90246 | 0.66 | 2.32302 | 0.79357 | 0.43047 | 1.26013
 0.17 | 1.44320 | 1.09991 | 0.69290 | 0.90917 | 0.67 | 2.34855 | 0.78883 | 0.42579 | 1.26770
 0.18 | 1.45574 | 1.09183 | 0.68694 | 0.91590 | 0.68 | 2.37449 | 0.78414 | 0.42114 | 1.27529
 0.19 | 1.46844 | 1.08384 | 0.68099 | 0.92265 | 0.69 | 2.40085 | 0.77949 | 0.41652 | 1.28289
 0.20 | 1.48132 | 1.07594 | 0.67507 | 0.92942 | 0.70 | 2.42763 | 0.77489 | 0.41192 | 1.29050
 0.21 | 1.49437 | 1.06814 | 0.66918 | 0.93620 | 0.71 | 2.45484 | 0.77034 | 0.40736 | 1.29813
 0.22 | 1.50760 | 1.06043 | 0.66331 | 0.94301 | 0.72 | 2.48249 | 0.76583 | 0.40282 | 1.30577
 0.23 | 1.52101 | 1.05281 | 0.65746 | 0.94984 | 0.73 | 2.51059 | 0.76137 | 0.39831 | 1.31342
 0.24 | 1.53460 | 1.04527 | 0.65164 | 0.95669 | 0.74 | 2.53915 | 0.75695 | 0.39383 | 1.32109
 0.25 | 1.54837 | 1.03782 | 0.64584 | 0.96355 | 0.75 | 2.56817 | 0.75257 | 0.38938 | 1.32878
 0.26 | 1.56234 | 1.03046 | 0.64007 | 0.97044 | 0.76 | 2.59767 | 0.74824 | 0.38496 | 1.33648
 0.27 | 1.57650 | 1.02318 | 0.63432 | 0.97734 | 0.77 | 2.62766 | 0.74394 | 0.38057 | 1.34419
 0.28 | 1.59085 | 1.01599 | 0.62859 | 0.98426 | 0.78 | 2.65814 | 0.73969 | 0.37620 | 1.35191
 0.29 | 1.60541 | 1.00887 | 0.62289 | 0.99121 | 0.79 | 2.68913 | 0.73548 | 0.37187 | 1.35965
 0.30 | 1.62017 | 1.00184 | 0.61722 | 0.99817 | 0.80 | 2.72063 | 0.73131 | 0.36756 | 1.36740
 0.31 | 1.63513 | 0.99488 | 0.61157 | 1.00514 | 0.81 | 2.75266 | 0.72718 | 0.36328 | 1.37517
 0.32 | 1.65030 | 0.98801 | 0.60595 | 1.01214 | 0.82 | 2.78523 | 0.72309 | 0.35904 | 1.38295
 0.33 | 1.66569 | 0.98121 | 0.60035 | 1.01916 | 0.83 | 2.81835 | 0.71904 | 0.35482 | 1.39074
 0.34 | 1.68130 | 0.97448 | 0.59478 | 1.02619 | 0.84 | 2.85202 | 0.71503 | 0.35063 | 1.39854
 0.35 | 1.69713 | 0.96783 | 0.58923 | 1.03324 | 0.85 | 2.88626 | 0.71106 | 0.34647 | 1.40636
 0.36 | 1.71318 | 0.96126 | 0.58371 | 1.04031 | 0.86 | 2.92109 | 0.70712 | 0.34234 | 1.41419
 0.37 | 1.72946 | 0.95475 | 0.57821 | 1.04739 | 0.87 | 2.95651 | 0.70322 | 0.33824 | 1.42204
 0.38 | 1.74598 | 0.94832 | 0.57274 | 1.05450 | 0.88 | 2.99254 | 0.69935 | 0.33416 | 1.42989
 0.39 | 1.76273 | 0.94196 | 0.56730 | 1.06162 | 0.89 | 3.02918 | 0.69553 | 0.33012 | 1.43776
 0.40 | 1.77973 | 0.93567 | 0.56188 | 1.06876 | 0.90 | 3.06646 | 0.69173 | 0.32611 | 1.44564
 0.41 | 1.79697 | 0.92944 | 0.55649 | 1.07591 | 0.91 | 3.10438 | 0.68798 | 0.32212 | 1.45354
 0.42 | 1.81447 | 0.92329 | 0.55113 | 1.08308 | 0.92 | 3.14296 | 0.68425 | 0.31817 | 1.46144
 0.43 | 1.83222 | 0.91720 | 0.54579 | 1.09028 | 0.93 | 3.18222 | 0.68057 | 0.31425 | 1.46936
 0.44 | 1.85023 | 0.91118 | 0.54047 | 1.09748 | 0.94 | 3.22216 | 0.67691 | 0.31035 | 1.47729
 0.45 | 1.86850 | 0.90522 | 0.53519 | 1.10471 | 0.95 | 3.26280 | 0.67329 | 0.30648 | 1.48524
 0.46 | 1.88704 | 0.89932 | 0.52993 | 1.11195 | 0.96 | 3.30416 | 0.66971 | 0.30265 | 1.49319
 0.47 | 1.90586 | 0.89349 | 0.52470 | 1.11921 | 0.97 | 3.34624 | 0.66615 | 0.29884 | 1.50116
 0.48 | 1.92496 | 0.88772 | 0.51949 | 1.12648 | 0.98 | 3.38908 | 0.66263 | 0.29506 | 1.50914
 0.49 | 1.94434 | 0.88201 | 0.51431 | 1.13377 | 0.99 | 3.43268 | 0.65914 | 0.29132 | 1.51713
 0.50 | 1.96402 | 0.87636 | 0.50916 | 1.14108 | 1.00 | 3.47705 | 0.65568 | 0.28760 | 1.52514

$$Z(X) = e^{-X^2/2}/\sqrt{2\pi}, \qquad P(X) = 1 - Q(X) = \int_{-\infty}^{X} Z(u)\,du.$$

  X   |   P/Z   |   Q/Z   |   Z/P   |   Z/Q   |  X   |   P/Z    |   Q/Z   |   Z/P    |   Z/Q
 1.00 | 3.47705 | 0.65568 | 0.28760 | 1.52514 | 1.50 |  7.20514 | 0.51582 | 0.13879  | 1.93868
 1.01 | 3.52222 | 0.65225 | 0.28391 | 1.53315 | 1.51 |  7.32448 | 0.51356 | 0.13653  | 1.94719
 1.02 | 3.56821 | 0.64885 | 0.28025 | 1.54118 | 1.52 |  7.44636 | 0.51133 | 0.13429  | 1.95570
 1.03 | 3.61502 | 0.64549 | 0.27662 | 1.54922 | 1.53 |  7.57087 | 0.50911 | 0.13208  | 1.96423
 1.04 | 3.66268 | 0.64215 | 0.27302 | 1.55726 | 1.54 |  7.69805 | 0.50690 | 0.12990  | 1.97276
 1.05 | 3.71121 | 0.63885 | 0.26945 | 1.56532 | 1.55 |  7.82799 | 0.50472 | 0.12775  | 1.98130
 1.06 | 3.76062 | 0.63557 | 0.26591 | 1.57340 | 1.56 |  7.96074 | 0.50255 | 0.12562  | 1.98985
 1.07 | 3.81094 | 0.63232 | 0.26240 | 1.58148 | 1.57 |  8.09639 | 0.50040 | 0.12351  | 1.99841
 1.08 | 3.86218 | 0.62910 | 0.25892 | 1.58958 | 1.58 |  8.23500 | 0.49826 | 0.12143  | 2.00698
 1.09 | 3.91437 | 0.62591 | 0.25547 | 1.59768 | 1.59 |  8.37664 | 0.49614 | 0.11938  | 2.01555
 1.10 | 3.96752 | 0.62274 | 0.25205 | 1.60580 | 1.60 |  8.52140 | 0.49404 | 0.11735  | 2.02413
 1.11 | 4.02166 | 0.61961 | 0.24865 | 1.61392 | 1.61 |  8.66935 | 0.49195 | 0.11535  | 2.03272
 1.12 | 4.07681 | 0.61650 | 0.24529 | 1.62206 | 1.62 |  8.82058 | 0.48988 | 0.11337  | 2.04131
 1.13 | 4.13299 | 0.61342 | 0.24196 | 1.63021 | 1.63 |  8.97517 | 0.48782 | 0.11142  | 2.04992
 1.14 | 4.19023 | 0.61036 | 0.23865 | 1.63837 | 1.64 |  9.13320 | 0.48578 | 0.10949  | 2.05853
 1.15 | 4.24854 | 0.60733 | 0.23538 | 1.64654 | 1.65 |  9.29477 | 0.48376 | 0.10759  | 2.06715
 1.16 | 4.30795 | 0.60433 | 0.23213 | 1.65472 | 1.66 |  9.45996 | 0.48175 | 0.10571  | 2.07578
 1.17 | 4.36849 | 0.60135 | 0.22891 | 1.66292 | 1.67 |  9.62887 | 0.47975 | 0.10385  | 2.08441
 1.18 | 4.43018 | 0.59840 | 0.22572 | 1.67112 | 1.68 |  9.80159 | 0.47777 | 0.10202  | 2.09305
 1.19 | 4.49305 | 0.59548 | 0.22257 | 1.67933 | 1.69 |  9.97824 | 0.47580 | 0.10022  | 2.10170
 1.20 | 4.55713 | 0.59257 | 0.21944 | 1.68755 | 1.70 | 10.15889 | 0.47385 | 0.098436 | 2.11036
 1.21 | 4.62243 | 0.58970 | 0.21634 | 1.69578 | 1.71 | 10.34367 | 0.47192 | 0.096677 | 2.11902
 1.22 | 4.68900 | 0.58684 | 0.21326 | 1.70403 | 1.72 | 10.53268 | 0.46999 | 0.094943 | 2.12769
 1.23 | 4.75685 | 0.58402 | 0.21022 | 1.71228 | 1.73 | 10.72604 | 0.46808 | 0.093231 | 2.13637
 1.24 | 4.82603 | 0.58121 | 0.20721 | 1.72054 | 1.74 | 10.92384 | 0.46619 | 0.091543 | 2.14506
 1.25 | 4.89655 | 0.57843 | 0.20422 | 1.72882 | 1.75 | 11.12622 | 0.46431 | 0.089878 | 2.15375
 1.26 | 4.96845 | 0.57567 | 0.20127 | 1.73710 | 1.76 | 11.33330 | 0.46244 | 0.088236 | 2.16245
 1.27 | 5.04176 | 0.57294 | 0.19834 | 1.74539 | 1.77 | 11.54520 | 0.46058 | 0.086616 | 2.17115
 1.28 | 5.11652 | 0.57022 | 0.19544 | 1.75369 | 1.78 | 11.76205 | 0.45874 | 0.085019 | 2.17987
 1.29 | 5.19276 | 0.56754 | 0.19258 | 1.76201 | 1.79 | 11.98397 | 0.45692 | 0.083445 | 2.18859
 1.30 | 5.27051 | 0.56487 | 0.18974 | 1.77033 | 1.80 | 12.21112 | 0.45510 | 0.081893 | 2.19731
 1.31 | 5.34980 | 0.56222 | 0.18692 | 1.77866 | 1.81 | 12.44362 | 0.45330 | 0.080362 | 2.20605
 1.32 | 5.43069 | 0.55960 | 0.18414 | 1.78700 | 1.82 | 12.68163 | 0.45151 | 0.078854 | 2.21479
 1.33 | 5.51319 | 0.55699 | 0.18138 | 1.79535 | 1.83 | 12.92528 | 0.44973 | 0.077368 | 2.22353
 1.34 | 5.59735 | 0.55441 | 0.17866 | 1.80371 | 1.84 | 13.17474 | 0.44797 | 0.075903 | 2.23229
 1.35 | 5.68321 | 0.55185 | 0.17596 | 1.81208 | 1.85 | 13.43017 | 0.44622 | 0.074459 | 2.24105
 1.36 | 5.77081 | 0.54931 | 0.17329 | 1.82046 | 1.86 | 13.69171 | 0.44448 | 0.073037 | 2.24982
 1.37 | 5.86019 | 0.54679 | 0.17064 | 1.82884 | 1.87 | 13.95955 | 0.44275 | 0.071636 | 2.25859
 1.38 | 5.95139 | 0.54430 | 0.16803 | 1.83724 | 1.88 | 14.23386 | 0.44104 | 0.070255 | 2.26737
 1.39 | 6.04446 | 0.54182 | 0.16544 | 1.84564 | 1.89 | 14.51481 | 0.43934 | 0.068895 | 2.27615
 1.40 | 6.13944 | 0.53936 | 0.16288 | 1.85406 | 1.90 | 14.80258 | 0.43765 | 0.067556 | 2.28495
 1.41 | 6.23638 | 0.53692 | 0.16035 | 1.86248 | 1.91 | 15.09737 | 0.43597 | 0.066237 | 2.29375
 1.42 | 6.33533 | 0.53450 | 0.15784 | 1.87091 | 1.92 | 15.39937 | 0.43430 | 0.064938 | 2.30255
 1.43 | 6.43632 | 0.53210 | 0.15537 | 1.87935 | 1.93 | 15.70877 | 0.43265 | 0.063659 | 2.31136
 1.44 | 6.53942 | 0.52972 | 0.15292 | 1.88780 | 1.94 | 16.02580 | 0.43100 | 0.062399 | 2.32018
 1.45 | 6.64467 | 0.52735 | 0.15050 | 1.89626 | 1.95 | 16.35065 | 0.42937 | 0.061160 | 2.32900
 1.46 | 6.75213 | 0.52501 | 0.14810 | 1.90473 | 1.96 | 16.68354 | 0.42775 | 0.059939 | 2.33784
 1.47 | 6.86186 | 0.52268 | 0.14573 | 1.91320 | 1.97 | 17.02472 | 0.42614 | 0.058738 | 2.34667
 1.48 | 6.97389 | 0.52038 | 0.14339 | 1.92169 | 1.98 | 17.37440 | 0.42454 | 0.057556 | 2.35551
 1.49 | 7.08830 | 0.51809 | 0.14108 | 1.93018 | 1.99 | 17.73282 | 0.42295 | 0.056393 | 2.36436
 1.50 | 7.20514 | 0.51582 | 0.13879 | 1.93868 | 2.00 | 18.10025 | 0.42137 | 0.055248 | 2.37322

The normal probability function. Table of area/ordinate ratios and their reciprocals (cont.)

  X   |   P/Z    |   Q/Z   |   Z/P    |   Z/Q   |  X   |    P/Z    |   Q/Z   |    Z/P    |   Z/Q
 2.00 | 18.10025 | 0.42137 | 0.055248 | 2.37322 | 2.50 |  56.69633 | 0.35427 | 0.017638  | 2.82274
 2.01 | 18.47692 | 0.41980 | 0.054122 | 2.38208 | 2.51 |  58.14464 | 0.35313 | 0.017198  | 2.83186
 2.02 | 18.86311 | 0.41825 | 0.053014 | 2.39094 | 2.52 |  59.63565 | 0.35199 | 0.016768  | 2.84097
 2.03 | 19.25908 | 0.41670 | 0.051924 | 2.39981 | 2.53 |  61.17075 | 0.35087 | 0.016348  | 2.85010
 2.04 | 19.66512 | 0.41516 | 0.050851 | 2.40869 | 2.54 |  62.75138 | 0.34975 | 0.015936  | 2.85922
 2.05 | 20.08152 | 0.41364 | 0.049797 | 2.41758 | 2.55 |  64.37902 | 0.34863 | 0.015533  | 2.86835
 2.06 | 20.50857 | 0.41212 | 0.048760 | 2.42646 | 2.56 |  66.05523 | 0.34753 | 0.015139  | 2.87748
 2.07 | 20.94657 | 0.41062 | 0.047741 | 2.43536 | 2.57 |  67.78159 | 0.34643 | 0.014753  | 2.88662
 2.08 | 21.39586 | 0.40912 | 0.046738 | 2.44426 | 2.58 |  69.55976 | 0.34533 | 0.014376  | 2.89576
 2.09 | 21.85675 | 0.40764 | 0.045752 | 2.45317 | 2.59 |  71.39146 | 0.34425 | 0.014007  | 2.90491
 2.10 | 22.32959 | 0.40616 | 0.044784 | 2.46208 | 2.60 |  73.27844 | 0.34316 | 0.013647  | 2.91406
 2.11 | 22.81471 | 0.40470 | 0.043831 | 2.47100 | 2.61 |  75.22256 | 0.34209 | 0.013294  | 2.92321
 2.12 | 23.31249 | 0.40324 | 0.042896 | 2.47992 | 2.62 |  77.22570 | 0.34102 | 0.012949  | 2.93237
 2.13 | 23.82329 | 0.40179 | 0.041976 | 2.48885 | 2.63 |  79.28985 | 0.33996 | 0.012612  | 2.94153
 2.14 | 24.34749 | 0.40036 | 0.041072 | 2.49778 | 2.64 |  81.41704 | 0.33890 | 0.012282  | 2.95070
 2.15 | 24.88550 | 0.39893 | 0.040184 | 2.50672 | 2.65 |  83.60939 | 0.33785 | 0.011960  | 2.95987
 2.16 | 25.43771 | 0.39751 | 0.039312 | 2.51566 | 2.66 |  85.86908 | 0.33681 | 0.011646  | 2.96904
 2.17 | 26.00455 | 0.39610 | 0.038455 | 2.52461 | 2.67 |  88.19839 | 0.33577 | 0.011338  | 2.97822
 2.18 | 26.58645 | 0.39470 | 0.037613 | 2.53357 | 2.68 |  90.59968 | 0.33474 | 0.011038  | 2.98740
 2.19 | 27.18387 | 0.39331 | 0.036787 | 2.54253 | 2.69 |  93.07536 | 0.33371 | 0.010744  | 2.99658
 2.20 | 27.79726 | 0.39193 | 0.035975 | 2.55150 | 2.70 |  95.62799 | 0.33269 | 0.010457  | 3.00577
 2.21 | 28.42711 | 0.39055 | 0.035178 | 2.56047 | 2.71 |  98.26016 | 0.33168 | 0.010177  | 3.01497
 2.22 | 29.07391 | 0.38919 | 0.034395 | 2.56944 | 2.72 | 100.97461 | 0.33067 | 0.0099035 | 3.02416
 2.23 | 29.73816 | 0.38783 | 0.033627 | 2.57842 | 2.73 | 103.77414 | 0.32967 | 0.0096363 | 3.03336
 2.24 | 30.42041 | 0.38649 | 0.032873 | 2.58741 | 2.74 | 106.66167 | 0.32867 | 0.0093754 | 3.04257
 2.25 | 31.12118 | 0.38515 | 0.032132 | 2.59640 | 2.75 | 109.64022 | 0.32768 | 0.0091207 | 3.05177
 2.26 | 31.84105 | 0.38382 | 0.031406 | 2.60540 | 2.76 | 112.71295 | 0.32669 | 0.0088721 | 3.06098
 2.27 | 32.58060 | 0.38250 | 0.030693 | 2.61440 | 2.77 | 115.88308 | 0.32571 | 0.0086294 | 3.07020
 2.28 | 33.34041 | 0.38118 | 0.029994 | 2.62341 | 2.78 | 119.15401 | 0.32474 | 0.0083925 | 3.07942
 2.29 | 34.12113 | 0.37988 | 0.029307 | 2.63242 | 2.79 | 122.52923 | 0.32377 | 0.0081613 | 3.08864
 2.30 | 34.92338 | 0.37858 | 0.028634 | 2.64144 | 2.80 | 126.01238 | 0.32280 | 0.0079357 | 3.09787
 2.31 | 35.74783 | 0.37729 | 0.027974 | 2.65046 | 2.81 | 129.60721 | 0.32184 | 0.0077156 | 3.10710
 2.32 | 36.59516 | 0.37601 | 0.027326 | 2.65948 | 2.82 | 133.31763 | 0.32089 | 0.0075009 | 3.11633
 2.33 | 37.46608 | 0.37474 | 0.026691 | 2.66851 | 2.83 | 137.14770 | 0.31994 | 0.0072914 | 3.12556
 2.34 | 38.36133 | 0.37348 | 0.026068 | 2.67755 | 2.84 | 141.10162 | 0.31900 | 0.0070871 | 3.13480
 2.35 | 39.28165 | 0.37222 | 0.025457 | 2.68659 | 2.85 | 145.18375 | 0.31806 | 0.0068878 | 3.14405
 2.36 | 40.22783 | 0.37097 | 0.024858 | 2.69563 | 2.86 | 149.39863 | 0.31713 | 0.0066935 | 3.15329
 2.37 | 41.20068 | 0.36973 | 0.024271 | 2.70468 | 2.87 | 153.75095 | 0.31620 | 0.0065040 | 3.16254
 2.38 | 42.20103 | 0.36850 | 0.023696 | 2.71374 | 2.88 | 158.24559 | 0.31528 | 0.0063193 | 3.17180
 2.39 | 43.22974 | 0.36727 | 0.023132 | 2.72280 | 2.89 | 162.88761 | 0.31436 | 0.0061392 | 3.18106
 2.40 | 44.28771 | 0.36605 | 0.022580 | 2.73186 | 2.90 | 167.68228 | 0.31345 | 0.0059637 | 3.19032
 2.41 | 45.37586 | 0.36484 | 0.022038 | 2.74093 | 2.91 | 172.63504 | 0.31254 | 0.0057926 | 3.19958
 2.42 | 46.49515 | 0.36364 | 0.021508 | 2.75000 | 2.92 | 177.75156 | 0.31164 | 0.0056258 | 3.20885
 2.43 | 47.64656 | 0.36244 | 0.020988 | 2.75908 | 2.93 | 183.03773 | 0.31074 | 0.0054634 | 3.21812
 2.44 | 48.83112 | 0.36125 | 0.020479 | 2.76816 | 2.94 | 188.49965 | 0.30985 | 0.0053050 | 3.22739
 2.45 | 50.04988 | 0.36007 | 0.019980 | 2.77725 | 2.95 | 194.14366 | 0.30896 | 0.0051508 | 3.23667
 2.46 | 51.30393 | 0.35889 | 0.019492 | 2.78634 | 2.96 | 199.97636 | 0.30808 | 0.0050006 | 3.24595
 2.47 | 52.59442 | 0.35773 | 0.019013 | 2.79543 | 2.97 | 206.00458 | 0.30720 | 0.0048543 | 3.25523
 2.48 | 53.92250 | 0.35657 | 0.018545 | 2.80453 | 2.98 | 212.23545 | 0.30632 | 0.0047118 | 3.26452
 2.49 | 55.28938 | 0.35541 | 0.018087 | 2.81364 | 2.99 | 218.67633 | 0.30545 | 0.0045730 | 3.27381
 2.50 | 56.69633 | 0.35427 | 0.017638 | 2.82274 | 3.00 | 225.33490 | 0.30459 | 0.0044378 | 3.28310

  X   |   Q/Z   |    Z/P     |   Z/Q   |  X   |   Q/Z   |     Z/P     |   Z/Q
 3.00 | 0.30459 | 0.0044378  | 3.28310 | 3.50 | 0.26657 | 0.00087289  | 3.75139
 3.01 | 0.30373 | 0.0043063  | 3.29240 | 3.51 | 0.26590 | 0.00084281  | 3.76082
 3.02 | 0.30287 | 0.0041782  | 3.30169 | 3.52 | 0.26523 | 0.00081370  | 3.77026
 3.03 | 0.30202 | 0.0040535  | 3.31100 | 3.53 | 0.26457 | 0.00078551  | 3.77969
 3.04 | 0.30118 | 0.0039322  | 3.32030 | 3.54 | 0.26391 | 0.00075822  | 3.78913
 3.05 | 0.30034 | 0.0038141  | 3.32961 | 3.55 | 0.26326 | 0.00073180  | 3.79857
 3.06 | 0.29950 | 0.0036992  | 3.33892 | 3.56 | 0.26260 | 0.00070624  | 3.80802
 3.07 | 0.29866 | 0.0035874  | 3.34824 | 3.57 | 0.26195 | 0.00068150  | 3.81746
 3.08 | 0.29784 | 0.0034787  | 3.35755 | 3.58 | 0.26131 | 0.00065756  | 3.82691
 3.09 | 0.29701 | 0.0033729  | 3.36687 | 3.59 | 0.26066 | 0.00063440  | 3.83636
 3.10 | 0.29619 | 0.0032700  | 3.37620 | 3.60 | 0.26002 | 0.00061200  | 3.84581
 3.11 | 0.29538 | 0.0031699  | 3.38552 | 3.61 | 0.25939 | 0.00059033  | 3.85527
 3.12 | 0.29456 | 0.0030726  | 3.39485 | 3.62 | 0.25875 | 0.00056936  | 3.86472
 3.13 | 0.29376 | 0.0029780  | 3.40418 | 3.63 | 0.25812 | 0.00054909  | 3.87418
 3.14 | 0.29295 | 0.0028860  | 3.41352 | 3.64 | 0.25749 | 0.00052949  | 3.88364
 3.15 | 0.29215 | 0.0027965  | 3.42286 | 3.65 | 0.25686 | 0.00051053  | 3.89311
 3.16 | 0.29136 | 0.0027096  | 3.43220 | 3.66 | 0.25624 | 0.00049221  | 3.90257
 3.17 | 0.29057 | 0.0026251  | 3.44154 | 3.67 | 0.25562 | 0.00047449  | 3.91204
 3.18 | 0.28978 | 0.0025430  | 3.45089 | 3.68 | 0.25500 | 0.00045737  | 3.92151
 3.19 | 0.28900 | 0.0024632  | 3.46024 | 3.69 | 0.25439 | 0.00044082  | 3.93098
 3.20 | 0.28822 | 0.0023857  | 3.46959 | 3.70 | 0.25378 | 0.00042482  | 3.94046
 3.21 | 0.28744 | 0.0023104  | 3.47895 | 3.71 | 0.25317 | 0.00040937  | 3.94993
 3.22 | 0.28667 | 0.0022373  | 3.48830 | 3.72 | 0.25256 | 0.00039444  | 3.95941
 3.23 | 0.28590 | 0.0021662  | 3.49767 | 3.73 | 0.25196 | 0.00038002  | 3.96889
 3.24 | 0.28514 | 0.0020972  | 3.50703 | 3.74 | 0.25136 | 0.00036608  | 3.97838
 3.25 | 0.28438 | 0.0020302  | 3.51640 | 3.75 | 0.25076 | 0.00035263  | 3.98786
 3.26 | 0.28363 | 0.0019652  | 3.52576 | 3.76 | 0.25017 | 0.00033963  | 3.99735
 3.27 | 0.28287 | 0.0019020  | 3.53514 | 3.77 | 0.24957 | 0.00032708  | 4.00683
 3.28 | 0.28213 | 0.0018407  | 3.54451 | 3.78 | 0.24898 | 0.00031496  | 4.01632
 3.29 | 0.28138 | 0.0017812  | 3.55389 | 3.79 | 0.24840 | 0.00030326  | 4.02582
 3.30 | 0.28064 | 0.0017234  | 3.56327 | 3.80 | 0.24781 | 0.00029197  | 4.03531
 3.31 | 0.27990 | 0.0016674  | 3.57265 | 3.81 | 0.24723 | 0.00028107  | 4.04481
 3.32 | 0.27917 | 0.0016130  | 3.58203 | 3.82 | 0.24665 | 0.00027054  | 4.05431
 3.33 | 0.27844 | 0.0015602  | 3.59142 | 3.83 | 0.24607 | 0.00026039  | 4.06381
 3.34 | 0.27772 | 0.0015090  | 3.60081 | 3.84 | 0.24550 | 0.00025059  | 4.07331
 3.35 | 0.27699 | 0.0014593  | 3.61020 | 3.85 | 0.24493 | 0.00024114  | 4.08281
 3.36 | 0.27627 | 0.0014112  | 3.61960 | 3.86 | 0.24436 | 0.00023202  | 4.09232
 3.37 | 0.27556 | 0.0013644  | 3.62900 | 3.87 | 0.24379 | 0.00022322  | 4.10183
 3.38 | 0.27485 | 0.0013191  | 3.63840 | 3.88 | 0.24323 | 0.00021474  | 4.11134
 3.39 | 0.27414 | 0.0012752  | 3.64780 | 3.89 | 0.24267 | 0.00020656  | 4.12085
 3.40 | 0.27343 | 0.0012326  | 3.65720 | 3.90 | 0.24211 | 0.00019866  | 4.13036
 3.41 | 0.27273 | 0.0011914  | 3.66661 | 3.91 | 0.24155 | 0.00019106  | 4.13988
 3.42 | 0.27203 | 0.0011513  | 3.67602 | 3.92 | 0.24100 | 0.00018372  | 4.14940
 3.43 | 0.27134 | 0.0011126  | 3.68544 | 3.93 | 0.24045 | 0.00017665  | 4.15892
 3.44 | 0.27065 | 0.0010750  | 3.69485 | 3.94 | 0.23990 | 0.00016983  | 4.16844
 3.45 | 0.26996 | 0.0010386  | 3.70427 | 3.95 | 0.23935 | 0.00016326  | 4.17796
 3.46 | 0.26927 | 0.0010033  | 3.71369 | 3.96 | 0.23881 | 0.00015693  | 4.18749
 3.47 | 0.26859 | 0.00096911 | 3.72311 | 3.97 | 0.23826 | 0.00015083  | 4.19702
 3.48 | 0.26791 | 0.00093601 | 3.73254 | 3.98 | 0.23772 | 0.00014495  | 4.20654
 3.49 | 0.26724 | 0.00090394 | 3.74196 | 3.99 | 0.23719 | 0.00013929  | 4.21608
 3.50 | 0.26657 | 0.00087289 | 3.75139 | 4.00 | 0.23665 | 0.00013383  | 4.22561
TABLES OF SYMMETRIC FUNCTIONS. PART V
By F. N. DAVID AND M. G. KENDALL


1. We write M for monomial symmetric functions, S for the one-part functions, U for
the unitary functions and H for the homogeneous product sums. Previously we have given
tables MS-SM (1949), UM-MU, HU-UH (1951) and MH-HM (1953). We now complete
the set, up to and including weight (w) 12, by the US-SU tables. The HS-SH tables are the same
as the present tables as far as coefficients are concerned.

2. The US-SU tables were partially calculated in the course of constructing
the previous sets. There are many ways in which they can be built up. We have used here
both the D operator technique and the elementary relations

$$a_w = \sum (-1)^{w+\pi_1+\pi_2+\dots+\pi_k}\, \frac{s_1^{\pi_1} s_2^{\pi_2} \cdots s_k^{\pi_k}}{1^{\pi_1} 2^{\pi_2} \cdots k^{\pi_k}\; \pi_1!\,\pi_2! \cdots \pi_k!},$$

$$s_w = \sum (-1)^{w+\pi_1+\pi_2+\dots+\pi_k}\, \frac{(\pi_1+\pi_2+\dots+\pi_k-1)!\; w}{\pi_1!\,\pi_2! \cdots \pi_k!}\; a_1^{\pi_1} a_2^{\pi_2} \cdots a_k^{\pi_k},$$

the sum in each case being taken over all possible partitions $(1^{\pi_1} 2^{\pi_2} \dots k^{\pi_k})$ of w.
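
The two relations can be checked numerically. The following is only an illustrative sketch: it assumes a_r denotes the r-th unitary (elementary) symmetric function and s_r the r-th power sum of a set of numbers, and every function name below is hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial, prod


def partitions(w, largest=None):
    """Yield each partition of w as a dict {part: multiplicity}."""
    if largest is None:
        largest = w
    if w == 0:
        yield {}
        return
    for part in range(min(w, largest), 0, -1):
        for rest in partitions(w - part, part):
            p = dict(rest)
            p[part] = p.get(part, 0) + 1
            yield p


def unitary(xs, r):
    """a_r: sum of products of the xs taken r at a time."""
    return sum(prod(c) for c in combinations(xs, r))


def power_sum(xs, r):
    """s_r: sum of r-th powers of the xs."""
    return sum(x ** r for x in xs)


def unitary_from_power_sums(s, w):
    """a_w written in terms of s_1, ..., s_w by the first relation."""
    total = Fraction(0)
    for p in partitions(w):
        rho = sum(p.values())                      # number of parts
        term = Fraction((-1) ** (w + rho))
        for part, mult in p.items():
            term *= Fraction(s[part] ** mult, part ** mult * factorial(mult))
        total += term
    return total


def power_sum_from_unitary(a, w):
    """s_w written in terms of a_1, ..., a_w by the second relation."""
    total = Fraction(0)
    for p in partitions(w):
        rho = sum(p.values())
        term = Fraction((-1) ** (w + rho) * w * factorial(rho - 1))
        for part, mult in p.items():
            term = term / factorial(mult) * a[part] ** mult
        total += term
    return total


xs = [2, 3, 5, 7]                                   # arbitrary test values
a = {r: unitary(xs, r) for r in range(1, 7)}
s = {r: power_sum(xs, r) for r in range(1, 7)}
for w in range(1, 7):
    print(w, unitary_from_power_sums(s, w) == a[w], power_sum_from_unitary(a, w) == s[w])
```

Exact rational arithmetic is used so that the comparison does not depend on rounding.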


3. To express the S-functions in terms of the U-functions we read horizontally up to
and including the diagonal figure in bold type. Thus for w = 5 we have, for example,

$$(3)(2) = a_1^5 - 5a_2a_1^3 + 6a_2^2a_1 + 3a_3a_1^2 - 6a_3a_2.$$

To express the U-functions in terms of the S-functions we read vertically downwards
up to and including the diagonal in bold type and divide the coefficients by w!. Thus

$$a_3a_2 \cdot 5! = 10(1)^5 - 40(2)(1)^3 + 30(2)^2(1) + 20(3)(1)^2 - 20(3)(2).$$
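
Both worked examples for w = 5 can be confirmed on any set of numbers, since they are polynomial identities. The sketch below assumes (r) denotes the power sum s_r and a_r the unitary function; the helper names `a` and `s` and the test values are illustrative choices only.

```python
from itertools import combinations
from math import factorial, prod

xs = [1, 2, 3, 4, 5, 6]   # arbitrary integer test values


def a(r):
    """Unitary function a_r: sum of products of the xs taken r at a time."""
    return sum(prod(c) for c in combinations(xs, r))


def s(r):
    """One-part function (r): sum of r-th powers of the xs."""
    return sum(x ** r for x in xs)


# Horizontal reading: (3)(2) in terms of the U-functions.
lhs_su = s(3) * s(2)
rhs_su = (a(1) ** 5 - 5 * a(2) * a(1) ** 3 + 6 * a(2) ** 2 * a(1)
          + 3 * a(3) * a(1) ** 2 - 6 * a(3) * a(2))

# Vertical reading: a_3 a_2 multiplied by 5! in terms of the S-functions.
lhs_us = a(3) * a(2) * factorial(5)
rhs_us = (10 * s(1) ** 5 - 40 * s(2) * s(1) ** 3 + 30 * s(2) ** 2 * s(1)
          + 20 * s(3) * s(1) ** 2 - 20 * s(3) * s(2))

print(lhs_su == rhs_su, lhs_us == rhs_us)
```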


4. We have called attention previously to the use of symmetric functions in distribution
problems. These present tables will be useful for that purpose. They will also add considerably
to the flexibility of the symmetric function system, for with this concluding set it will
now be possible to express any symmetric function in terms of any other.
Tables of symmetric functions.

Table 5.2
w = 2    | a_1^2 | a_2
(1)^2    |   2   |   1
         |   1   |
(2)      |       |  -1
         |   1   |  -2

Table 5.3
w = 3    | a_1^3 | a_2a_1 | a_3
(1)^3    |   6   |   3    |   1
         |   1   |        |
(2)(1)   |       |  -3    |  -3
         |   1   |  -2    |
(3)      |       |        |   2
         |   1   |  -3    |   3

Table 5.4
w = 4    | a_1^4 | a_2a_1^2 | a_2^2 | a_3a_1 | a_4
(1)^4    |  24   |   12     |   6   |   4    |   1
         |   1   |          |       |        |
(2)(1)^2 |       |  -12     |  -12  |  -12   |  -6
         |   1   |   -2     |       |        |
(2)^2    |       |          |   6   |   .    |   3
         |   1   |   -4     |   4   |        |
(3)(1)   |       |          |       |   8    |   8
         |   1   |   -3     |   .   |   3    |
(4)      |       |          |       |        |  -6
         |   1   |   -4     |   2   |   4    |  -4
Table 5.5 Table 5.6 
| } 
w=5 | ay® GG," Gs"; GyG;* AyQ_ 24a, as w=6 | ay® aya;* ag2a,* Gy? gy” Gye, Gs" GyQ;* 2G, GsQ; ae 
(x) | 120 60 30 20 10 5 I (x) 722 360 = 180 90 120 60 20 30 15 6 I 
I I 
(2) (1)? | a —60 —60 —40 —30 | (| ' a —360 —270 —360 —240 —120 —180 —105 — 60 —15 
(2)* (x) | 30 3° «#«I5'= «5 (2)* (1)* | 180 270 180 180 9° «64135 go 45 
“4 ¢ - | —4 4 
2 | 40 20 40 20 * -90 —45 15 
GG) = 3 (2) = Iz =-8 8 
-20 —20 . 240 120 (oO 46240 «682120)=«—o«d120 4° 
(3) (2) — * 698 ‘ (3) (1)? } = er 
| -30 —30 “120 —240 —120 ~120 —120 
(4) (x) | lee sae pil “= (3) (2) (x) ag 6 s “on 
(s) | at (3)* se ; ” 
i. < “§ «6 -@ =p =f Te | 1 6 9 6 —18 9 
(4) x)" =iie. “99 ~180 '~99 
—4 2 4 -4 i 
90 
(4) (2)! I -6 10 —4 4 8 —4 8 
| 144 144 
Oo I ~ey 5 5 ~S ry, 5 
(6) | py 
| I -6 9 -2 6 —12 3 —6 6 6 =-6 
Table 5.7 
ng a Gy" gQs® —ag*Qy® gy Ag y* gg y® Og g® ag", 0g Qy® 4g, yg ys*® ggg, a, 
(1)? | $040 2520 1260 630 840 420 210 140 210 105 35 42 21 7 I 
| I 
(2) (x) | =2520 —2520 —1890 —2520 —1680 —1050 -—840 —1260 -—735 —315 —420 ~—231 —105 —21 
| I -2 
(2)? (x) | 1260 = 1890 1260 1470 1260 630 9045 735 630 525 315 105 
I —4 4 
2 —630 —630 —315 —315 =S55 ig. . —105 
(2° (x) I -6 12 -8 
(3) (x) 1680 840 420 560 1680 840 350 840 420 280 7° 
I =—3 3 
(3) (2) (1) — 840 —840 —1680 —840 -1260 -840 ~84c -840 -—420 
I —5 6 ° 3 -6 
2 420 210 420 210 
(3) (2) I -7 16 —12 3 —12 12 ‘0 ‘ P . 
2 5 560 280 280 
(3)* @) I -6 9 6 —18 ° 9 
| (4) (1)? -1260 -630 -210 —1260 -630 -630 -210 
rile F oS i 6 6 6 
3° ° 3° 3° 3° 
(4) (2) (x) ‘ i a a ‘ p ai 3 
- 420 —420 
(4) (3) 1 —-7 14 -6 7 —24 6 12 —4 12 -12 
(s) (1)* 1008 = 504 1008S 504 
1 “9 5 5 -§ =§ 5 
= 504 + “$04 
(s) (2) I —-7 1s —10 5 15 10 -5 10 5 -10 
(6) (1) —- 840 —840 
I -6 9 -2 6 —12 e 3 -6 6 ‘ 6 -6 
(7) 720 
I -7 14 -7 7 —21 » 7 —7 14 =—9 7 a “—- 7 
Table 5.8
w=8 (a) a;* @,a,° a,*a,* a,°a,;* a,‘ a,a,° 44,45 aata, a,*a;* a;*a, a,a;* 
cy | 40320 20160 10080 5040 2520 6720 3360 1680 1120 560 1680 
‘ Le 
(2) rye | -20160 -—20160 -—15120 -—10080 -—20160 —13440 — 8400 —6720 -3920 —10080 
i I -2 
(2)* ye | 10080 15120 15120 : 10080 11760 10080 8400 5040 
a I —4 a 
(2)? ze ye | -5040 —10080 5 p — 5040 ; — 5040 - | 
. I -6 12 -8 | 
: 2520 : : : : : «| 
I -8 24 —32 16 | 
13440 6720 3360 4480 2240 13440 | 
| : I -3 ° : . 3 
sya -6720 -—6720 -13440 —8960 é] 
| (3) (2) £0? | | 
: I —§ 6 pA e 3 -6 wr ‘ 
ave) | 3 : 720 . 
(3) (2) i) * ~7 16 —12 A 3 —12 12 | 
(3)? €x)* | <= 
: I —6 9 " . 6 —18 . 9 | 
2 = 2240 - | 
| (3) @ | I —8 21 —18 ‘ 6 —30 36 9 -18 
| @ine | : ls ‘ ‘ oe 
| (4) (2) (1)" 1 -6 10 = : 4 -8 3 i =4 
(4) (2)? | I -8- 22 —24 8 4 16 16 - ° —4 
(4) (3) @) | I ~% 14 —6 7 —24 6 12 —4 
( I -8 20 —16 4 8 —32 16 16 -8 
| (s)(1)? | I -5 5 ‘ 5 -5 - -5 
| (5) (2) (x) | I —2 15 —10 5 —15 10 - : -5 
(5) (3) I —8 20 —I5 8 —35 30 15 —15 —§ 
(6) (1) I —6 9 —2 6 —12 ; 3 “ -6 
(6) (2) I —8 21 —20 4 6 —24 24 3 -6 —6 
(7) (1) I - 14 =—— 7 —2I 7 7 : = 
(8) I -8 20 —16 2 8 —32 24 12 -8 —8 | 
u: =8 (ii) | CC a a, G4, GyQyQ, gy aa," Ma, Ay a 
| 
aa . 
} = (1) 840 420 280 7O 336 168 56 56 28 8 I 
| (2) (1) —5880 -3360 -—2520 —840 -—3360 —1848 —728 —840 = ~ — 168 —28 
{2)* (1)* 7560 6720 5880 2940 5040 4200 2520 2520 1680 840 210 
{2)* (1)? | —2520 —5040 —2520 —2520 + —-2§20 —2520 —840 -—1680 —840 —420 
hal 3 (2)* e 1260 ° 630 “ ° ° 420 ° 105 
| -3(3) (1)* 6720 3360 2800 1120 6720 3360 1232 2240 1120 560 112 
| (3%:(2) (1)? | —6720 -6720 —10080 -6720 -6720 -6720 -5s600 -6720 4480 -3360 —1120 
(32.(2)* (1) P 3360 1686 3360 . 3360 5040 ‘ 3360 1680 1680 
4 ne ; ‘ : 4480 4480 * ‘ 2240 2240 1120 2240 1120 
:(3)* (2 a d ‘ " . - 2 » —I120 - —II2z0 
(4) (1)* —5040 -—2520 —1680 —840 —10080 —s040 -1680 ~—5040 -2520 -—1680 —420 
(45 (2) (x) 5040 5040 5040 5040 ‘ $040 5040 5040 5040 5040 2520 
A 8 
S “i — 2520 + 2520 d d é + 2520 + 1260 
=(4) (2) sé = eae * a ¥ 
= -33 —672°0 : * <2 . oo nemgg 39 
( (3) (1) i : = 38 
: (4)* 2520 : . . : . . 1260 
: 4 32 —16 —32 16 
Cs) (x)? 8064 4032 1344 8064 4032 4032 1344 
. ‘ . = : 5 
~ = 4032 —4032 + 4032 —4032 —4032 
(s} (2) (@) 0 ’ J 5 =%0 oa 
“3 2688 ° ° e 2 
= (5) (3) sie 
1s . —15 e 5 1s 15 - 
6) (1)* -6720 -3360 -6720 —3360 
. 6 ° ° * 6 F F -6 ae in 
3360 . -# 
(6) (2) 18 —12 ° e 6 —-12 - ° -6 12 ot 
5760 57 
(7) (1) 4 : -7 d 7 -7 . = . 7 
(8) = 3040 
24 -8 —16 4 8 —16 8 -8 8 8 -8 
Table 5.9
' 
to =9 (i) a,’ @,a," a,*a,* a,’ a,* ay‘ a; a,a,* @yQ,4;* a3a;°a,"  a,a,° @,*a;° Gy 3 G> gy gy, ! 
J 362880 181440 90720 45360 22680 60480 30240 15120 7560 10080 5040 1680 15120 7560 
(a) 
I 
(a) (x) = 181440 —181440 —136080 —90720 ~—181440 —120960 —75600 —45360 -60480 —35280 —15120 —90720 —52920 -4 
I -2 f 
Gays 90726 6136080 8136080 90720 105840 90720 90720 6975600 45360 45360 68040 : 
j 1 -4 
(2)* (x) = 45360 —90720 —45360 —75600 —45360 —45360 — 22680 -§ 
» I -6 12 = 
22680 22680 i 
“ ——= 
(2)* (2) I -8 24 —32 16 i 
(3) @)* 120960 60480 30240 15120 40320 20160 10080 120960 60480 ' 
| I —s 3 
| (3) (2) (2)* | - 60480 —60480 —45360 —120960 —80640 —60480 — 60480 1 
I -S5 6 3 -6 
| 2(7)2 30240 45360 60480 90720 ‘ 
|(3) (2) (x) I -7 16 —12 3 —12 12 | 
(3) (2) ‘ -_ 30 sonia 24 3 —18 36 -24 60 60 
40320 8201 201 
| (4)* (1)? 
| (3)* (1) , = e 6 aa : ft 2 
s 201 — 60480 
(3)* (2) @) I -8 21 —18 6 —-30 36 9 -18 
2 13440 P 
| @) 1 -9 27 —27 9 — 54 81 27 —81 27 “a } 
90720 —45360 -j 
1 S| 
{ (4) (1) 5 aa 2 4 =a ia 
| 2 45300 § 
(4) (2) (2) | x é fe iH 4 - J 
| (4) (2)* (2) | I -8 22 —24 8 4 —16 16 —4 16 
} Ba | I ee | I -6 , 7 —24 6 ; 12 - -4 12 
| (4) 1%) | I - 2 — 34 12 4 —38 54 —12 12 —24 “4 20 
a Y I - 20 —16 4 —32 16 : 16 ‘ - 32 I 
5) (1 I =—< 5 : . 5 —-§ ° ° : —5 ‘ 
| (8) 2) (3) | I -7 15 —10 ‘ 5 —15 10 " : -§ 10 } 
8 2)! I - 29 —40 20 5 —25 40° —20 * 3 -5 20 | 
(s) 3} (1) | I - 20 —15 ! 8 —35 30 4 15 —15 —§ 15 
(s) (4) | I = 27 —30 10 9 —45 50 —10 20 —20 -9 40 
| (6) (x) | 1 -6 9 -2 , 6 12 ‘ . 3 : : ~6 6 
| (6) (2) (x) | I -8 21 ~20 4 6 —24 24 4 3 ~6 : -6 13 | 
OS | I -9 27 —29 6 9 —48 63 -6 21 —45 9 -6 4 |} 
2? I -7 I —7 ‘ 7 —21 7 a 7 “ . —9 14 | 
2 (2) I -9 28 —35 14 7 —35 49 —14 7 —14 ’ - 28 f 
(8) (1) 1 -8 20 —16 2 8 —32 24 J 12 -—8 7 24 
(9) I -9 27 —30 9 9 —45 54 a 18 —27 3 —— 30 | 
—— —S -_ 
w= 9 (ii) | Ggaya;"  GyGgay aia aa 44,4," asa," asa3a a3a a,a,;* 44,4 aa a,a,* a;a, ' 
“ a a 1 6 2 15 222, 15 Ay 54g A, 5s Ge 6 Ay 6 2g; 6 As in @y 7a, aya, } 
— —— —- —-- — — ——__—— —— -—4 
(1)? 2520 1260 630 3024 1512 756 504 126 504 252 84 72 36 9 | 
(2) (1)? | —22680 —12600 -—7560 —30240 —16632 -—9072 -—6552 -2016 -—7560 —4032 —I5IZ2 —1512 —792 — 252 
(2)* (1)* 52920 37800 26460 45360 37800 27216 22680 9828 22680 15120 7560 7560 4536 1890 
et) —22680 —37800 — = oO —22680 —30240 —22680 — a —7560 — 15120 ~ —- —-7560 —- 7560 —3780 -1 
2y°(1 ‘ 11340 5670 . 11340 . 5670 . 3780 3780 " 3780 94 
(3) (1)° 25200 12600 10080 60480 30240 15120 11088 3528 20160 #10080 3528 5040 2520 1988 
(3) (2) (1)* | —90720 —57960 —60486 —60480 —60480 —45360 —50400 —27720 —60480 —40320 —22680 —30240 —17640 —10080 -i 
(3) (2)? (1)? 15120 52920 30240 30240 45360 45360 37800 30240 37800 15120 22680 15120 J 
(3) . = 7560 . . —43§120 - —7560 < - 2520 ~—7560 . a 
(3)? *| 40320 20160 40320 20160 20160 20160 10080 10080 20160 10080 10080 |; 
(3) a) . 20160 ‘ —20160 —20160 . —10080 ~heees — 10080 — 10080 -i 
. ° . : ¢ ; 7 J J 720 > R 
{a ifs —15120 -—7560 -7560 —90720 —45360 —22680 —15120 -—4536 —45360 —22680 -—7560 —15120 -7560 -—3780 j 
| (4) (2) (1)? | 45360 30240 ©. 445360 45360 45360 45360 30240 45360 45360 30240 45360 30240 22680 4 
(4) (2)* (1) - 22680 —22680 — 22680 . 22680 . 22680 —22680 . ~—22680 —11340 -I 
| (4) (3) @)* = 30240 —15120 —60480 —30240 —4536¢ —1I§120 —30240 —I5120 ~— 30260 -Kf 
| -12 
| 15120 15120 15120 15120 | 
(4) (3) (2) pm fs a 
(4)* a) 22680 22680 11340 4 
—32 16 
(s) (x)* 72576 36288 18144 12096 3024 «= 72576 «= 336288 12096 36288 18144 12096 3 
5 
(s) (2) (1)* — 36288 —36288 —36288 — 18144 —36288 —36288 — 36288 —36288 —36288 -1 
5 -10 
s 18144 9072 18144 ‘ 
(5) (2) ; -—— Ss 
24192 24192 24192 24192 
1 —_—_ 
(s) (3) (1) oe P =s8 is 2 
— 18144 . a 
(s) (4) —40 20 20 5 —20 10 20 -20 
(6) (1) ~ 60480 —30240 —10080 —60480 —30240 — 30240 -! 
6 -6 
30240 30240 30240 
(6) (2) (x)  — = 12 = i 
— 201 . si 
(6) (3) —18 18 6 —18 18 —6 18 -18 es oak 
51 25920 = 51 y 
I 2 
(7) @) -7 7 ~7 ’ -~7 7 
(7) @) -7 14 ° 7 —21 4 ° ° -7 14 7 -14 
@))  _46 ' 4 8 16 : 8 oo < 8 4 , 
(9) ~-a 18 9 9 —27 9 18 “9 -9 18 —- 9 —-9 -9 














Table 5.10
ok 
—s @) a,” a,a,° a;*a;* a,>a;* a;*a,* a,* aa," a,a,a,;° @:0,"a;7 ag, 
: (x) eee 1814400 907200 453600 226800 113400 604800 302400 151200 75600 
ia) (a)? . - oe —1814400 —1360800 — 907200 —567000 1814400 -—1209600 -—756000 453600 
23) (a) 907200 1360800 1360800 1134000 907200 1058400 907200 
= I —4 a 
| 34)* aye : 3 eee 907200 -1134000 —453600 —756000 
$ ~~ 12 = 
| Saya) 226800 567000 3 226800 
o I -8 24 —32 16 
(2)! — 113400 3 
2 I -10 40° —80 80 -32 
(3) (x)? | 1209600 604800 302400 151200 
S | ; va ? 
($@) ay | ; : : —Seshce ~~ Goafion — 453600 
~ | ce 3 - 
(3 (2) (a)* sen4ne 453600 
- I -7 16 —12 3 —12 12 
(3 @)@) - . ‘ = 18 L200 
res - 2. - 
-2(3)* (1)* I —6 . i“ . 3 - 8 ™ 4 
(3° (2) tise I -8 21 —18 ‘ 6 —go 36 i 
| I —10 37 —60 36 6 —42 -72 
3 3)* (1) I -9 27 —27 . 9 — 54 ; . 
~ (4) Gy) I —4 2 . : A : 
(3) (2) Hf 1 -6 10 —4 4 . 3 -8 ‘ ‘ 
(a: (2)* (1)* I -8 22 —24 8 < 4 —16 16 4 
4} 2)* I -10 38 — 68 56 —16 4 24 48 -32 
(«) (3) G)® I -7 v4 -6 ‘ee : 7 —24 6 f 
(4.43)(2)(@) I -9 28 — 34 12 . 7 —38 54 12 
“ (4) (3 I —10 35 —48 18 e 10 — 66 120 —36 
= (4)? Ga)? | I 8 20 —16 4 . 8 —32 16 q 
; (4)? (2) | I —10 36 —56 36 -8 8. —48 80 —32 
>. (5) (1° I -5 5 , . ue ' 
(s Gf I -7 15 —10 ‘ : a 10 C 
‘s 2) (1) I -9 29 —40 20 5 —25 40 —20 
‘s) (3) (1) 1 -8 20 —I15§ » 8 —35 30 . 
£5) 3) {2} I —10 36 —55 30 8 — 51 100 — 60 
{()@a I -9 27 —30 10 9 —45 5° —10 
yi (5) I -10 35 —5° 25 10 —60 100 —50 
° (6) (1)* I -6 9 -2 ‘ 6 ~- > 
to) (3) 1)" I -8 21 —20 4 ‘ 6 eas 24 i 
(6) (2)* I —10 37 —62 44 -8 6 —36 72 — 48 
76) (3) {} I -9 27 —29 6 ° 9 - 6 -6 
3 (6) (4 I —10 35 —50 26 -4 10 - 9 —32 
(7) G) I ne 14 Fi. . . 7 21 
oD) 3} {} I =9 28 —35 14 . 7 —35 0 -14 
: 3 I —10 35 —-49 21 ° 10 —63 112 —42 
. ah cs I -8 20 —16 2 ° 8 —32 a 
pe {8} §) I -—10 36 —56 34 -4 8 —48 Fr —48 
- OU I “> 27 ~2e 9 . 9 —45 54 -9 
: (10) I —10 35 —50 25 —2 10 —60 100 —40 
Table 5.10 (cont.)




















a," a,4;" 


aa," 





3) (1)" 

G) Gt 
g (2)* (1)* 
3) (2)* (1) 
(3)* (x)* 

(3)* (2) (1)* 
(3)* (2)* 

(3)* (x) 

(4) (x)* 

(4) (2) (x)* 
(4) (2)* @)* 
(4) (2) 

(4) (3) G@)* 
(4) (3) (2) @) 
(4) 3), 


(4)* (1)* 
(4)* (2) 


of 


OM 
) Cy 
oe 





50400 
— 352800 

756000 
— 453600 


201600 
— 806400 
604800 


201600 


- 201600 


-18 
—36 


—81 


25200 
— 201600 

554400 
— 604800 


100800 
— §04000 
705600 
— 302400 
100800 


— 201600 


100800 
36 





21 


16 














151200 
— 453600 
453600 
— 151200 


— 113400 
340200 
— 340200 


113400 


32 


32 
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= 10 (iii) @,a;* aa," i'd, a,a,* Gy0,Q,° gy, 4gg;*” ag gy Bs Qgay a; @,a;* 
(1)** 4200 6300 3150 30240 15120 7560 
5 5040 2520 1260 
G 4 — 50400 — 75600 —40950 —30: — 166320 —90720 — 65520 _ — 20160 - oun m- 
2yi (xe 201600 170100 453600 378000 272160 226 146160 280 Re 
(2)* 1)¢ — 302400 -—226800 -—245700 —226800 302400 -226800 -226800 ~—151200 -75600 —75600 
(2) a 113400 56700 “ee 113400 113400 56700 56700 . 
(3) (1) 50400 100800 50400 604800 302400 151200 110880 
* 55440 35280 10080 201600 
( 1S Be en —604800 —352800 —604800 — 00 —453600 — 504000 39% Oo —277200 —110880 -— 604800 
t ss 55200 302400 453600 302400 453600 453600 478800 378000 252000 ° 
33) (1) | —151200 .  ™I§1200 — 151200 - —226800 —75600 —151200 ° 
ig) 2) Re 5 Saaene 403200 201600 201600 100800 201600 100800 201600 
o 3 ag soo + 201600 —201600 ~—201600 —201600 rae . 
2) ; s ‘ I 100800 > 
: (4) (1)* — 25200 — 75600 —37800 -—g07200 453600 -—226800 —151200 - 
4 75600 — 45360 —I5120 — 600 
oY) (2) i 151200 453600 264600 ¢ 453600 453600 453600 302400 302400 setae Senseo 
4) @ ays —226800 -—226800 —340200 . —226800 - —226800 -—226800 —226800 i 
. ° . 113400 . > 
- 4).G).(1)® | —100800 —604800 —302400 ~ = : : : 
i 302400 151200 —453600 — 302400 
o * ; i 30: 0° 302400 151200 151200 302400 
4) (3 : ; ; ) 
- 36 
(4)? @)* oes 113400 226800 226800 
I 
(4)* (2) oe st 
(s) (x) 725760 362880 181440 120960 60480 30240 12096 725760 
(s) (2) (a) = 362880 -362880 -—362880 -—241920 -—181440 —120960 
5 -10 
(5) (2)* (x) 181440 181440 90720 181440 
5 —20 20 
(5) (3) (@)* 241920 120960 241920 241920 
5 -15 ‘ 15 
(s) (3) (2) see og 
5 —25 30 15 30 
(s) (4) (x) aitiate ~ 362880 
20 5 —20 10 20 -20 
. 145152 
(s) 25 10 —50 50 50 —50 —50 25 
6) (1)* ad 
(6) (x) n 6 x -6 
(6) 2 I “ é 6 —12 s -6 
2 e 6 = 24 ° - 
(6) £3 fr) : 5 $ 6 ~%8 2 18 ‘ = 
4) —12 24 —24 6 —24 12 24 —24 -6 
(7). Gy . ° 7 -7 F { = 
(7) a fr) é . e 7 —21 14 : ‘ i 
3 —21 ° . 7 —28 21 21 — a3 : - 
(8) (x)? 7 4 2 8 —16 ‘ 5 . - 
f 3 - 4 -8 8 —32 32 8 —16 ¢ -8 
9) (1 . 9 . 9 —27 9 18 : —9 . =9 
(10 —10 15 —10 10 40 30 30 —20 —20 5 -10 
Table 5.10 (cont.)
w = 10 (iv)
Table 5.11 
es w=rrt (i) a," aa," a,* a," a,*a,* a,‘ a,* a,* a, aa," @,a,a,° @,a,*a,* @,a,°a;" a, a," 
: () 39916800 19958400 9979200 4989600 ©=_ 2494800 1247400 6652800 3326400 1663200 831600 415800 
—45 I 
630 (2) (x) = 19958400 -— 19958400 —14968800 — 9979200 —6237000 —19958400 — 13305600 —8316000 -— 4989600 —2910600 
3150 I -2 
4725 ) (a)* (a)’ 9979200 14968800 14968800 12474000 3 9979200 11642400 9979200 7484400 
I —4 4 
240 (2)* (1) sue —9979200 — 12474000 3 - —4989600 ~—8316000 —9:47600 
5040 I —6 12 = 
5200 (2)* (x) 2494800 6237000 . . . 2494800 5405400 
nd : I -8 24 —32 16 
. 5 ie E2ETq0O — 1247400 
oes (2)* (1) I —10 40 —80 80 32 
5 = 13305600 6652800 3326400 1663200 831600 
2400 (3) (1) A my ; ‘ ——— 
1260 ; * ~ 6652800 -—6652800 —498 9600 — 3326400 
8900 (3) (2) (1) : p 6 <2 
67700 : ; ; , ' i 3326400 989600 989600 
8900 (3) (2)* (1)* : ee ~ a : ‘ ate — aheees 7 
iso - 166, — 3326400 
1200 (3) (2)* (1)* —— 3 
~— I -9 30 —44 24 . 3 —18 36 24 lene 
}0'700 4 
67700 s ol ‘ = tr 4 — 104 112 —48 3 - 24 72 -6 7 
I ~ ° ° . =i . . 
a 3)? (2) (1)? I -8 21 —18 6 —30 36 ; | 
Sree | a Gh arn I —10 37 —60 36 6 —42 96 -72 
072 j I I -9 27 —27 9 — 54 I x | 
wee | 3° @) : ee 45 —81 54 9 —72 189 — 162 
4) (1 - 4 ° ° ° 
1440) | (4) {5 I —6 10 —4 4 —8 
72576 (4) (2° (1)? I -8 22 —24 8 4 —16 16 
leis (4) (2)* (x) I —10 38 —68 56 —16 4 —24 —32 
5 
p1200 | (4) (3) (1)* I =9 14 —-6 7 —24 6 
1 | (4)(3) (2) ()* I -9 28 —34 12 7 —38 54 —12 
75600 4) (3) (2)* I Ir 46 —90 80 —24 7 —§2 130 — 120 24 
i 4) (3)° (1) I —10 35 —48 18 10 —66 120 —36 
aaa I 4)" (1)? I -8 20 —16 8 —32 16 < 
peue (4)? (2) fr) I —10 36 —56 36 -8 8 - 80 —32 ° 
kil | 4)" ) I —fr 44 -76 52 —1I2 Ir - 172 —96 12 
51 I I -§ 5 F 5 - ‘ ‘ 
(5) 33 i I —7 15 —10 5 nan 10 
36400 f (5) (2)* (1)* I =< 29 —40 20 5 —25 40 —20 
2) I —IL 47 —98 100 —40 5 —35 90 — 100 o 
59200 | | (s) {3} 2 I 8 20 —15 , i 8 —35 30 / m 
(5) (3) (2) (x) I —10 36 -s5 30 8 —51 100 —60 
72800 5) (3)* r —25 44 -75 45 Ir —83 195 —135 
' , ) (x)? I -9 27 —30 10 . 9 —45 5° —10 
26800 5) (4) {2) t tr 45 - 7o —20 9 —63 140 —110 20 
3 1) I = 35 —50 25 P " —60 100 —50 
I I - -2 —12 ‘ 
26800 > (6) 2 1) I -8 28 —20 4 6 —24 24 
j 6) (2)* (1) I —10 37 62 44 —8 6 —36 72 —48 
03200 
\ (6) (3) (1)* I —— 27 —29 6 9 = 63 ~6 
62880 (6) (3) (2 I —1 45 —83 64 —12 9 - 159 —132 12 
(6) (4) (x I —10 38 —50 26 —4 ro —60 96 —32 
—— 6) I —2t 44 —7 5S —10 Ir —77 165 -1I5 10 
= 7 i I —9 it ~% 7 —21 7 . 
(7) (2) (x) I =a =“ 14 7 —35 49 ~14 
2)" I —1r 46 —9or 84 —28 i - 119 —112 28 
(7) 3 AY I —10 35 —49 21 10 —63 112 —42 
} 4 I -IL 44 —77 56 —14 Ir -77 161 —98 14 
| ( West I -8 20 —16 2 8 —32 24 
(8) (2) (x) I —10 36 —56 34 - 8 -48 88 —48 
{3 ) I —t 44 —76 5° 3 Ir —80 180 —120 6 
‘ ( ya) I -9 27 —30 9 —45 54 -9 
&) 2 I —1Ir 45 - 69 -18 9 —63 144 117 18 
(10) 3 I —1t0 35 —50 25 -2 10 —60 100 - 
(11 I —Ir 44 —77 S55 —11 Ir -77 165 —110 11 
w=11 (ii) @,*a,;* @,*a,a,° @,* a," a; @,*a,* @,* dy aa," @,a,a,° @,a,*a,° 4,0,° a, QQ_a;* gy? | w=1 
4 ©8800 66 831600 800 8600 | 
a ~66s2800 - sBiotoo Seatstee = paeees vonenee = nase - Guise - 33ab400 - ibptees: ~ 2494800 - “4 { ( 
(2) 1)’ 9979200 316000 4989600 33ztgo0 4989600 Fyest og ws 4 5821200 4158000 (2 
(2) 1) - —4989600 -—665 —4989600 —498 - 24 —4989600 —5821z200 -—2494800 —4158000 (2 
an to) . . 24 ‘ 94800 1247400 5 1247400 | \ (2 
eB | tes ggted tutes _ yates gaye) sts ted pats BESS sree =| a 
Shae {2}, tt | | . 665 7761600 9979200 8316000 7 ; 3326400 4989600 1663200 5821200 g { 
(3) (2)? (1) | . + 332 + —4989600 + —1663200 4 — 831600 i {3} & 
(3) (a) | " . ; P B : : - : ° ; | € 
(3)* (x)* | 4435200 2217600 1108800 2217600 1108800 : 3 . . 4435200 sais | : sf 
| | 3)° ¢ 
(3)* (2) (x)* | : = 2217600 2217600 -—6652800 —4435200 . P Z - — 2217600 | | (3)? ¢ 
3 | 9 -18 i @ 
(3)* (2)* (2) ‘ sas ee ‘ 3326400 y . - ° ° ‘| *: 
(3) (a) | 1478400 739200 | , (2 
| 27 —81 J S ‘y 4) ( 
| — 739200 ‘ i : A , Bl 
(3) (2) | a ~oeg ake qe "ie IP 3 f 
4) @"| = 9979200 4989600 -2494800 ~—1247400 -—1663200 —831609) | (4) (3) ( 
4 : : : i . =4 i) @G 
(4) (2) ()* | : i aes 4989600 3742200 4989600 eis wt 
— 2494800 — 3742200 - —2494800 » 
(4) GP Gr)? | d J é = 3 -4 16 -16 | 4) 
| | 1247400 ‘ a 
(4) (2)? (x) e . ° ° e —4 24 —48 32 ‘ al (4)? ( 
«| 33 — 
(4) (3) @) | 12 ; “ , ‘ -4 12 ‘ -12 i G 
eal 1663200 
(4) (3) (2) (2)* 12 - J ; J —4 20 —24 4 —12 = | (5 
{4} (3) (2)* 12 4 48 : . -4 28 —64 48 —12 8 
(4 J) 3 — 108 18 36 . = ~ -% : 72 Ay) (s)(2 
6 _ < P. q —8 —80 2 - i 
i (4 : &} ps “1464 48 48 d -8 % —112 38 36 at (s) (2) 
(9) a , : e : : a 10 : : ° f iN (s 
§ (2)? ¢ 4 e . —-5§ 20 —20 é ‘f | 
ts da ' P ‘ : id 30 ~60 40 | &G 
treaty 15 45 9s : ; = 18 ae ; tS 3} OO 
{5} {3}, ts 39 105 135 45 —45 =—s 30 —45 “8 bod I (5 
(9) ¢ I 20 * r ° : -9 —30 —40 2)! (5) 4. 
) 4} 6 } 20 - 40 é . -9 58 —110 60 —40 ror) (5) 
8) (x) 7 he _ : = % Lae Te [ f§ 
8} 2 fr, 3 -6 F ‘ d —6 18 -12 é 4 HY 6) (2' 
* (1) 3 —12 12 ‘ F —6 30 —48 24 > ‘ 6) 2 
Qo ) a) 21 —45 9 P -6 —18 -18 a) (6) (3) 
we re | *.. % — oe aa S a) 0 
: 33 —105 75 15 -15 a res —105 40 = : 120 & 
1)* F ° - 1 FE - 
(7) 2 nt 7 71 é : = a a sé 7 | og 
(7) 3} n) ab —84 21 21 : -7 35 - -2b Si) OG 
f 35 112 42 28 Ir Jo —126 56 —63 1 
I 12 - . ~ 24 - : -% (8) 
8 2 8 8 8 6 
“Oe ££ = 8 s oO oe oe 6g: BOR ws 
( Xn 18 ~27 : 3 . -9 36 ~27 é —27 18! (9) 
oo) 2) 18 —63 54 3 =" -9 54 —99 54 —27 Fi ts 
(10) 25 —60 I 10 é —10 ° —60 10 —40 j (10 
(11) 33 —110 68 22 —1Ir —11 & —110 4 —s55 134 
@qQq0q0)') | wy =11 (iii) 4,2; a," a; aa,;* a, a,a, a,> a, @,a,° @,a,a;* @,a,°a,* @,a,* @,a,a,° 545 2,0, 
138600 | (18 69300 46200 6 5440 

= 19300 34650 11550 2 66 
“I 8000 | (2) (xy | = 762300 ~554400  —831 — 450450 =—173250 — s3aqe0 = 182 320 aonneas ‘1 stone -73 é 
pei se (2)* (1)? 2772000 2217600 sake 1871100 4989600 415 2993760 1 “= be os % 

—415 (2)? (1)° —4158000 — 3326400 Moe <= —2702700 — BP ont pe - =2494800 -— 3326400 ae wae : 
1247400)\ ‘eo ‘ » Rr we 1247400 23700 1559250 1351350 : : 1247400 2286900 sees * 1247400 
1386000 | 1)* 693000 554400 1108800 a 2 aa ¢ 5 4c ge 9680 609840 

= $ 75600 3 Ge ( ye —3880800 - 3880800 — 66sa800 7 3B8eko0 - 1g¢aq00 Gosa8o0 ~88: ~ tees - sszbqoo - s5q4000 3381840 

5 I 6098400 7207200 3326400 89600 4851000 ° 3326400 9600 % 66800 
831600 | (3) & ae —3326400 — 1663200 won a 3200 —2494800 y ? Ri ew eee ? > Be-+ey- ~o4 
+i (3) (2)* 415800 ° P 5 
2217600 | | (3)? (1)§ 1108800 1663200 4435200 2217600 saab Sa1690 7600 08800 
|| (3)? (2) ()* | -2217600 + —5§544000 - —2217600 4435200 : Snabebns 600 oa 

— 2217600 |) (3)* (2)? (1) TI08800 554400 FS . 1108800 é 5 ; ; oe aoe 

5 as t 2.) 1478400 : ‘ 1478400 4 - ss , ; ‘ 
“ih 1)?| 415800 -277200 —831600 —415800 138600 — © 4989600 —2494800  — Doles 5 5 
| %) ate its 2079000 1663200 4989600 2910600 1247400 3 Spantoo 4989600 ro need poe tised sarees 
« (2) ‘f I —2910600 -—2494800 -—2494800 — 3742200 —2910600 y + —=2494800 ~ pine ned vs - iat 

1 4) (2)* (1) | 1247400 . : 1247400 1247400 : : . 1247400 . 

| (4) (3) (1) | —831600 —1108800 — 6652800 -— 3326400 —1386000 6.400 
1663200 3326400 ‘ 3326400 ported i ; ‘ 1 t ee 
2| 831600 - ; - —831600 ; i : . _—— 

soll q (4) (3) (2)* | = 
3326400 | = 2308800 : 
\ (4) (3)? (1) — ; - 2217600 
— ] sd 
2494800 sai 2494800 1247400 415800 
(4)* (x) —_ 
oF 1 4 5. 
}} (@)*(@)@) | 6 eS AIM 
— 1663200 : : 
(4)? (3) | =] 
1663200) ce cd * oe ™ 
a (5) (x)* —7083360 3991680 1995840 = 997920 © 1330860 665280 
a ° ; ss : 5 
2 (s) (2) (x)* 3991680 -3991680 -2993760 -—3991680 —2661120 
; r A i ‘i = 5 -10 
bl (5) (2)* (1)* 1995840 2993760 é 1995840 
Be | 8 i ‘ 5 —20 20 
i GS) —297920 
| - e ° * < 5 —30 60 -40 
}| ()@)@ 2661120 1330560 
a ° " . ; ; 5 —15 ‘ F 15 
#}| G)@)G) ae 
: . , — i —2 C) ° - 
bad I (5) (3)* : —45 . 5 ~ 30 4s 30 = 50 
®T! (5) (4) (x)? ; ‘ 20 5 —20 10 M 20 
woh) (5) @) (2) —40 - 20 —40 5 —30 50 —20 " 20 -40 
§ (x “ . 25 ; 10 —50 50 a 50 —50 
a 6) ry, “ . 6 “ 
‘D 6) 2) 1) : ; : 6 -12 ‘ 
i 6) (2)? (1) P 6 ‘ 6 —24 24 
Bh} (6) (3) (x)? ; 6 —18 8 
bi 3) (2) —36 : : 6 - - = 
n 8 Be a —12 24 —24 6 as * va * 
120 Oe —30 —15 30 —30 Ir —60 75 —10 60 —9o 
' . . . . 7 -7 . i 
S} (7) (2) (x)? : . 7 —21 
FI o a —28 : 7 —_ 56 —28 ! 
P| (7) {3} (x is - P é : . 7 - 2x a 21 —21 
a ee si 2 Pa = ll po 
if (8) (x } ‘ : 4 : : & a = ~ 4 7 
KS (8) (2 . " 4 -8 s & —32 2 ‘ - 
7 ‘BG —24 —48 4 —12 12 & —- 3 . sf 
. é 9 4 9 —27 ‘ 18 
ay (9 3 —36 : 9 —18 9 —45 63 —18 18 —36 
i (10) (1 . —10 15 -10 A 10 —40 " 30 -20 
ie! || II —33 —33 22 —33 11 11 —55 1x 44 —66 
w = 11 (iv) a,a,* G,a,a," O54, 0, as? i, G,a,° ga," Q,a," a, a.a,4;" CA ea 4a, ' 
1) 9 13860 S930 2772 55 27720 13860 9240 4620 2310 
(2) (1)? | —147840  -—221760 -—117810 —55440 —831 3520 —235620 —166320 —87780 —48510 — 11550 
2 . id 7761 1081080 651420 pane ‘aoe 1663200 1053360 oe 498960 318780 o7ees 
2)? (1) | —1663200 —1663200 —1372140 —831600 _— —1663200 -—1663200 -1386000 -~1108800 — 762300 — 318780 
j , 1247400 623700 114 . 623700 415800 1039500 415800 googoo $1975° 381150), 
f G ‘ a —311 J — 207900 — 207900 — 103950 — 103950; 
221760 388080 togo4o 110880 221 7600 1108800 554400 388080 194040 110880 2 \ 
"| —1774080 —3 —1718640 —1219680 -—6652800 -—4435200 —2772000 —2494800 -—1441440 —1108800 —3 i 
js 4435200 415: 3603600 2772000 ° 3326400 3880800 4158000 2326400 2772000 1386000 
js — 3326400 —831600 —2494800 — 1663200 + 1663200 —277200 -2217600 -—1108800 —1386000) 
‘ ap i 415800 : 6 ‘ ; 138600 3 1860s 
: 776160 2217600 1108800 1108800 2217600 1108800 554400 1108800 554400 831600 Bo8o 
* | —3326400 -—2217600 —2217600 — 2217600 —1108800 —1108800 -— 3326400 -—2217600 —2772000 —1663200 
3} ay) 2772000 1108800 1108800 554400 : 1663200 277200 138%009 
‘if 739200 ° : ; 739200 369600 739200 369600 
@ @) — 739200 ~ s : : ? + | —369600 : — 369600 
4} Hs — 277200 — 498960 — 249480 — 166320 —4989600 -—2494800 —1247400 —831600 —415800 — 221760 - 55449 
( ) (2 i 1663200 3326. 1912680 1663200 4989600 4989600 3742200 2tosthos 2079000 1663200 665: 
é (2)? (1 —24 —24 —2910600 —2494800 + 2494800 —}3742200 —2494800 —2910600 — 2494800 — 1663200 
4) (2)* (1) 1247400 ‘ 1247400 1247400 31600 831600 
3} i —11098800 -—4989600 -—2494800 — 3326400 — 1663200 -831600 -—2217600 —138600 
AY a 3326400 1663200 3326400 3326400 1663200 1663200 3326400 3326401 
4} 3)A2 2)* ; s —831600 n : —831600 . —831 
*(1)| —1108800 ‘ s ; —554400 554400 
¢ % (1)? . 2494800 1247400 2494800 1247400 1247400 
(4)* (2) (1 ive } 4 . 1247400 3 . 1247400 —1247400 
uf 221760 332640 166320 133056 7983360 3991680 1995840 1330560 665280 332640 77615 
(5) (2 — 1330560 —1995840 -—1164240 —1330560 + —=3991680 -—3991680 —3991680 -2661120 —1995840 — 83160 
(s) (2)* HF 1995840 997920 1490880 1995840 1995840 1995840 997920 149688 
e P — 498960 " : ; . — 166320 
re aK 887040 2661120 1330560 2661120 2661120 1330560 2661120 177408 
(s) (3) vis — 2661120 —1330560 —2661120 — 1330560 - 266112 
2 887040 ° : : 44352 
(s) (3) —a 
(s) (4) (x) —2se ae 997920 3991680 — 1995840 —2993760 
(s) (4) (2) -.» Ss a 
(s)* (2) an eee 159667 
(6) (1)* - 6652800 -3326400 -1663200 -1108800 -—5§54400 -277200 — 554" 
-6 | 
(6) (2) (x)* : ieee 3326400 3326400 2217600 1663200 554400 
(6) (2)* (x) pe? j nee — 1663200 —831600 —83160 
2 3 - 2217600 -—1108800 -—2217600 —1108800 
(6) (3) (x) E: 18 = 
08800 1108800 
6) (3) (2 == 
(6) (3) (2) -6 30 —36 —18 36 66 66 j 
6 1 1663200 1663200 
6) (4) (2) =e - in tt . - ’ 
6 = 1330560 
(6) (s) 1s —60 jo 30 -6 jo —30 —30 3° 3° “yp 
7) (1)* “ ‘ ; ; -7 _ 4 ; . . : 
(7) (2) G7)? . -7 I 5 4 4 . 
2)" ‘ -7 2 —28 _ R 
(7) B & ‘ 5 ; -7 at ‘ = : : 
-2 2 ‘ - 2 —14 - 
@as : ’ : a 3 
(8) (2 & ' f ° " -8 24 —16 " > ‘ F 
3 24 » ° -8 32 —24 —24 24 > 5 
) G . -9 . -9 18 “ -9 : . A 
3 s > -9 18 -9 36 —36 -9 18 A , 
(10) (x ‘ —20 . 5 —10 30 —10 —20 » 10 ‘ 
(ir Ir -33 22 Ir -11 44 -33 -33 22 22 -u 
w = 11 (v) @,a,* @,a,a," @,a,* G1 Q3Q; a, a, @,a,;* Oy 0,0; A, Oy @,a,* yay 9% ay 
(1? 79 3960 198¢ 1320 330 990 495 165 110 5s Ir I 
(2) 1)® | —166320 ye —45540 —31680 —8910 —27720 — 14355 — 5115 — 3960 — 2035 —4095 —s55 
2)" (1)’ 831 ¢ 960 293 221760 77220 2071 117810 510 41580 22770 3° 990 
2)*(1)* | —831600 -831600 —665 - — 263340 —415 —311850 —173250 —13' - — 34650 — 6930 
2 fe 1)* rs 415800 623700 415800 311850 103950 259875 225225 103950 121275 $1975 17325 
3} (1) + —207900 + — 103950 - —“S2e75 —Sze7s - —SI975 ~1I0395 10395 
3) (1) 55 277220 138600 95040 25740 110880 5 18810 18480 9240 33° 
(3) (2) 37 | —3326400 —1940400 —1108800 -887040 -— 332640 —1108800 — - —277200 —147840 -5§ —9240 
i) {2 1)* | 1663200 2494800 2217600 2217600 1247400 1663200 1386000 831600 554400 277200 — 
(3) (2)* | —831600 —1663200 —1108800 —11 . —831600 -970200 --277200 -—554400 -—277200 —13 
& 2)* | ‘ ° 415800 . 207900 - ; 34650 138600 P x) 
(3)? (1)® 2217600 1108800 554400 554400 277200 1108800 55. 221760 3 I 92400 I 
a 2) (1)* + —1108800 —1108800 — 2217 — 1663200 —1108800 —110) — 1108: —1108800 —739200 —554400 —1 
3)* (2)* (1) 554400 554400 831 554400 1108800 . 554400 277200 277200 
(3)? (1)? . . 739200 739200 - 3 246400 123200 246400 123200 
3)* (2) a : ° 3 a ‘ - -3 + 123200 - 123200 
4) (1)? |—1663200 -831600 -—415800 -—277200 —71280 -—415800 — 207900 — 69300 —83160 —41580 — 13860 — 1980 
(4) (2) tis 4989600 3326400 2079000 1663200 665280 2494800 1455300 623700 831600 57380 207900 41580 
(4) (2)? (1)* + —2494800 —2910600 —2494800 — 1663200 —1247400 —I87I1100 ~—1455300 —1247400 —1039500 —623700 — 207900 
CONTROL CHARTS WITH WARNING LINES 


By E. S. PAGE 
Statistical Laboratory, University of Cambridge* 


Process inspection schemes using control charts with warning lines are considered and the properties 
of some schemes based on the observations from the last few samples are evaluated. Tables of schemes 
for controlling the mean of a normal population are given. A method is suggested for controlling both 
the mean and standard deviation of a population on a single chart; examples are given when the 
population distribution is normal. 


1. INTRODUCTION 


A problem that arises in industrial applications of statistics is to detect changes in the para- 
meters specifying the quality of the output from a continuous production process, so that 
some rectifying action can be taken to restore the parameters to satisfactory values. A widely 
used scheme for this purpose (a process inspection scheme) is based on a control chart 
(Shewhart, 1931). Samples of fixed size are taken at regular intervals and a statistic of the 
sample (for example, the mean, the range, or the number of defectives) is plotted on the 
chart; if the sample point falls outside control limit(s) drawn on the chart, i.e. if the statistic 
differs by more than a given amount from its satisfactory value, rectifying action is taken. 
This scheme will be referred to as a single-sample scheme, since the decision whether or not 
to take action is based upon a single point on the chart; also in what follows the lines on the 
chart denoting a serious departure of the statistic from its satisfactory value will be called 
action lines instead of control limits. With any process inspection scheme there will in general 
be a delay between the point at which the parameters changed and that at which the scheme 
demands action. During this delay the production fails to meet the quality requirements. 
On the other hand, in practical cases, process inspection schemes, in particular single-sample 
schemes, cause action to be taken even when the parameters constantly maintain satis- 
factory values. In the one case there is a loss due to the production of substandard material, 
and in the other due to unnecessary interference with the production process. These losses, 
as a function of the parameters, may be used as a basis for the selection of process inspection 
schemes or for the comparison of two schemes. When the fraction of the production that 
is inspected is constant, a convenient assessment of these losses can be obtained from 
the average number of articles that are inspected before the scheme requires rectifying 
action when the parameters remain constant; this number, a function of the parameter 
values, has been called the average run length (Page, 1954a; cf. Aroian & Levene, 1950). 
We shall base the choice of inspection schemes on the behaviour of their average run length 
functions. 

Consider a single-sample control chart scheme for controlling the mean of a normal 
distribution. Suppose that the standard deviation, $\sigma$, of the process is known, and that the 
‘ideal’ value for the process mean is $\mu$. Then samples of size $N$ are examined at regular 
intervals, and if the mean, $\bar{x}$, falls outside the action lines drawn at $\mu \pm B_1\sigma/\sqrt{N}$, rectifying 
action is taken. The current practice is to choose $N$, usually small, and to use ‘three-sigma’ 


* Present address: University of Durham. 
limits, i.e. to take $B_1 = 3$. These action lines are chosen so that the chance is about 0.002 
that a given sample point will cause action to be taken when the mean is at its satisfactory 
value $\mu$. This consideration gives no guidance about the best sample size to choose nor any 
information about the average amount of time that will elapse before a change in the mean 
is noticed. If the fraction of production that may be inspected is fixed, for example, by a 
limitation on the man-hours for inspection, it is more realistic to choose the scheme with 
the most suitable average run length function. Tables giving values of $N$ and $B_1$ have been 
constructed to enable this choice to be made easily for controlling the mean of a normal 
distribution (Page, 1954b), and similar tables could be computed for other situations. It 
turns out that the best sample size to use is often much larger than those customarily taken 
in industrial practice, so that samples should be taken less frequently. But although the 
theoretically best sample size is large it may not be practically convenient to take samples 
of this size; further, the quality control engineer has got used to small samples and he may 
be reluctant to change his habits radically. There is, too, his (correct) feeling that a small 
sample will spot a very serious change in the mean, and for this reason he will wish to continue 
taking small samples frequently. It is therefore of interest to seek schemes of the control 
chart type that are easy to apply, that require only small samples to be taken, and yet retain 
the advantages of the single-sample schemes using large samples. One method is suggested 
by a modification of the single-sample scheme that has occasionally been used (Dudding & 
Jennett, 1944); in this a cluster of ‘moderately’ extreme sample points is treated as a single 
point outside the action lines and accordingly action is taken. A point is adjudged ‘moder- 
ately’ extreme if it lies between warning lines and the action lines.* Thus for controlling the 
mean of a normal population with known standard deviation, $\sigma$, at an ideal value $\mu$, warning 
lines would be drawn at $\mu \pm B_2\sigma/\sqrt{N}$, where $B_2 < B_1$. 
We shall consider rules of the following type: 


I. Choose k, n, N. Take samples of size N. Take action if any point falls outside the action 
lines or if any k out of the last n points fall outside the warning lines. 
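
Rule I can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coding of the chart regions as 0, 1, 2 and the function name `rule_one_action_index` are hypothetical and do not come from the paper.

```python
from collections import deque


def rule_one_action_index(regions, k, n):
    """Return the 1-based index of the first sample at which rule I
    demands action, or None if it never does.

    regions: sequence of region codes (an assumed coding):
        0 = inside the warning lines,
        1 = between the warning and action lines,
        2 = outside the action lines.
    """
    window = deque(maxlen=n)  # keeps only the last n points
    for index, region in enumerate(regions, start=1):
        if region == 2:
            # Any point outside the action lines calls for action at once.
            return index
        window.append(region)
        # Count the points outside the warning lines among the last n.
        if sum(1 for r in window if r >= 1) >= k:
            return index
    return None


if __name__ == "__main__":
    # Two warning-zone points within four consecutive samples.
    print(rule_one_action_index([0, 1, 0, 0, 1], k=2, n=4))
```

With `n=3` the same sequence produces no action, because the two warning-zone points are then never inside the same window of three.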


In the next section we show that in a certain sense it is reasonable to consider only two 
special cases of rule I, and in the following section their properties are evaluated. 


2. RESTRICTION OF THE RULES 


In the rules of type I only the region of the chart in which a sample point falls is taken into 
account when deciding whether or not to take action; the position of the sample point within, 
say, the warning region is not considered. For the rules it is only necessary to count the 
number of points in the various regions; however, for mathematical convenience it is useful 
to suppose that a score is assigned to each point according to the region in which it falls. 
A point outside the warning or action lines is obtained when the quality of the sample falls 
below the required level; if a mark or score is to be assigned to the sample such a defection 
will merit some penalty. On the other hand, a point within the warning lines will receive 
a bonus score. It is reasonable to base a process inspection scheme on these scores; an 
accumulation of penalties will be taken to indicate a deterioration in quality and suitable 
rectifying action will be taken. After each sample is taken the total penalty score over the 
last ‘few’ samples is examined and action is taken if it is ‘large’. In order to make the scheme 
precise, let a score $x_i$ be assigned to the $i$th sample, where $x_i = -a, b, c$ $(a > 0,\ c > b > 0)$, 
* A chart using only warning lines has been considered by Weiler (1953). 
according as the sample point falls within the warning lines, between the warning and action 
lines, or outside the action lines, respectively. The scheme is: 


II. Take action after the n-th sample if any of the inequalities 

$$\sum_{i=n-r}^{n} x_i \geq h_r \qquad (r = 0, 1, \ldots, s)$$

are satisfied, where s and the $h_r$ are suitably chosen constants. 


We consider the case where s is limited only by the number of samples drawn since action 
was last taken, so that all the partial sums, working back from the last sample, are examined; 
and where $h_r = h$ for all r, so that the total scores in the last one, two, three, ..., samples are 
tested to see that none has reached h. In an earlier paper (Page, 1954a) it was shown that 
this scheme is equivalent to a sequence of linear sequential tests (Barnard, 1946; Wald, 
1947) with initial score on the acceptance boundary; if the test ends on the acceptance 
boundary the test is reapplied, while if it ends on the rejection boundary action is taken. 
Consequently we consider what restrictions must be placed on k and n in order that the 
scheme I shall be equivalent to the repeated application of a linear sequential test with 
boundaries at (0, h) and initial score zero.* 
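
A minimal sketch of scheme II with a common limit h and no restriction on how far back the partial sums run is given below. It assumes that action is demanded when a partial sum reaches h (the reading suggested by the phrase "at least h" in the next paragraphs); the function names are hypothetical. The first function keeps only the largest partial sum ending at the current sample, the second examines every partial sum directly, and the two should agree.

```python
def scheme_two_action_index(scores, h):
    """First 1-based sample index at which some partial sum, worked back
    from the latest sample, is at least h; None if that never happens."""
    best_suffix = 0  # largest partial sum ending at the previous sample
    for index, x in enumerate(scores, start=1):
        # Either extend the best earlier run of samples or start afresh.
        best_suffix = x + max(best_suffix, 0)
        if best_suffix >= h:
            return index
    return None


def scheme_two_brute_force(scores, h):
    """Same rule, checking every partial sum explicitly."""
    seen = []
    for index, x in enumerate(scores, start=1):
        seen.append(x)
        if any(sum(seen[-r:]) >= h for r in range(1, len(seen) + 1)):
            return index
    return None


if __name__ == "__main__":
    # Illustrative scores: bonus -1, warning-zone penalty 3, limit h = 4.
    example = [3, -1, -1, 3]
    print(scheme_two_action_index(example, 4), scheme_two_brute_force(example, 4))
```

Resetting the running sum to zero whenever it would go negative is what corresponds to restarting the sequential test on its acceptance boundary.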

First, we have that an initial sequence of $(k-1)$ points between warning and action lines 
has total penalty score $(k-1)b$; since this sequence does not require action to be taken 

$$(k-1)b < h. \tag{1}$$


If now this sequence is followed by $(n-k)$ points within the warning lines each receiving 
a bonus score $-a$, and then by another point between the warning and action lines, scheme I 
requires action to be taken. Hence the total penalty score must be at least h, i.e. 

$$kb - (n-k)a \geq h. \tag{2}$$


Further, any sequence of $(n-k+1)$ bonus points (i.e. between the warning lines) is 
equivalent to restarting the scheme; in particular, the set consisting of $(k-1)$ points between 
the warning and action lines followed by $(n-k+1)$ bonus points must have a total penalty 
score at most zero. Hence 

$$(k-1)b - (n-k+1)a \leq 0; \tag{3}$$

indeed, if this inequality were not satisfied a finite number of such sets of n points would 
cause the total penalty score to exceed h and action to be taken, contrary to the conditions I. 
The combination of (1) and (2) gives 

$$b/a > n-k, \tag{4}$$

and hence with (3) we obtain 

$$(n-k) < (n-k+1)/(k-1), \tag{5}$$

since $k \geq 2$ for the warning lines to be distinct from the action lines. It follows that 

$$(k-2)\,n < (k-1)^2, \quad \text{i.e.} \quad n < \frac{(k-1)^2}{k-2} = k + \frac{1}{k-2} \quad (k > 2). \tag{6}$$


For scheme I, k and n are integral and $2 \leq k \leq n$, so that (6) can be satisfied only for 
(i) $k = 2$, any n, 
or (ii) $k = n$. 
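
The conclusion can be checked by brute force. The sketch below assumes inequality (5) in the cleared form $(n-k)(k-1) < n-k+1$ and lists the integer pairs with $2 \leq k \leq n$ that satisfy it; the upper limit of 8 and the function name `admissible` are arbitrary choices made for the illustration.

```python
def admissible(k, n):
    """Inequality (5) with the positive factor (k - 1) cleared."""
    return (n - k) * (k - 1) < (n - k + 1)


# All integer pairs with 2 <= k <= n <= 8 that satisfy the inequality.
pairs = [(k, n) for k in range(2, 9) for n in range(k, 9) if admissible(k, n)]

if __name__ == "__main__":
    for k, n in pairs:
        print(k, n)
```

Every pair that survives has either k = 2 or k = n, in agreement with cases (i) and (ii).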


* The restriction on the initial score is unnecessary ; if the initial score is Z (0<Z<h), inequalities 
are obtained in a similar way and found to imply Z=0. 
The two rules we consider therefore are: 

III. Choose n, N. Take samples of size N. Take action if two points out of any sequence of 
n fall between the warning and action lines or if any point falls outside the action lines. 

IV. Choose n, N. Take samples of size N. Take action if n consecutive points fall between the 
warning and action lines or if any point falls outside the action lines. 


In these two cases it cannot happen that some sets of points not requiring action to be 
taken have greater total penalty scores than some sets leading to action; for any other 
values of k, n the anomalous position can, however, occur. The above theory admits the 
corollary that no other types of warning lines, more or less extreme than those considered 
above, may be introduced and yet permit the equivalence of a rule of type I based on the 
last n points, and a rule of type II. In the next section we derive the average run lengths of 
the schemes III and IV. 


3. THE AVERAGE RUN LENGTHS OF THE RULES 


Although we shall not use the following method, for completeness we remark that the 
average run length of the general rule of type I may be evaluated by enumerating the 
possible combinations of the last $(n-1)$ points on the chart such that action has not been 
required, and treating the combinations as the states of a discrete Markov chain (e.g. 
Bartlett, 1953; Feller, 1950). The matrix of transition probabilities, $P$, may be written down; 
the vector giving the probability that the chart is in a given state after the rth point is 
$p_r = P^r p_0$, where $p_0$ is the vector specifying the initial state. Accordingly, the probability 
that action has not been taken up to and including the rth sample is the sum of the elements 
of $p_r$, i.e. $\mathbf{1}'p_r$, where $\mathbf{1}' = (1, 1, \ldots, 1)$. Hence the probability that action is taken immediately 
after the rth sample is $\mathbf{1}'(p_{r-1} - p_r)$, so that the average run length is given by 

$$L = N\sum_{r=1}^{\infty} r\,\mathbf{1}'(p_{r-1} - p_r),$$

i.e. 

$$L = \mathbf{1}'(I - P)^{-1}p_0 N, \tag{7}$$

where N is the size of the sample. 
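
The matrix calculation behind (7) can be sketched without any external library. The code below is an illustration only: it assumes the transition probabilities are supplied as a list of rows, with entry `[i][j]` the probability of passing from no-action state i to no-action state j, and it solves the equivalent linear system $(I - Q)m = \mathbf{1}$ for the expected number of samples m from each starting state. The function names are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for c in range(col, size + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * size
    for i in range(size - 1, -1, -1):
        rest = sum(M[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (M[i][size] - rest) / M[i][i]
    return x


def average_run_length(Q, sample_size, start=0):
    """Average number of articles inspected before action, starting in
    state `start`; Q[i][j] is the chance of moving from state i to j
    without action being required."""
    size = len(Q)
    I_minus_Q = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(size)]
                 for i in range(size)]
    expected_samples = solve_linear(I_minus_Q, [1.0] * size)
    return sample_size * expected_samples[start]


if __name__ == "__main__":
    # Single-sample scheme: one state, left with probability 0.1 per sample.
    print(average_run_length([[0.9]], sample_size=5))
```

For a single state with no-action probability q this reduces to N/(1 - q), the familiar run length of a single-sample chart.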

In the general case the size of the matrix P increases rapidly with n so that the labour 
involved in inverting I — P is considerable. Fortunately, the transition matrices for the rules 
III and IV are of a simple type and the states can be more simply specified so that the 
average run lengths can be found with little trouble. Let the probabilities that a given point 
falls between the warning lines, between the warning and action lines, and outside the action 
lines be $p_0$, $p_1$, $p_2$ respectively. Suppose that a suitable convention is made for points falling 
on the lines; for example, that any point falling on a line is to be regarded as falling into the 
adjacent more extreme region of the chart. With such a convention, clearly $p_0 + p_1 + p_2 = 1$. 
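
For a two-sided chart on the mean of a normal population these three probabilities follow from the normal distribution function. The sketch below is an illustration under assumptions: the shift of the mean is expressed in units of the known standard deviation, the warning and action lines are given as multiples of the standard error of the sample mean, and the names `region_probabilities`, `warning_multiple` and `action_multiple` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def region_probabilities(shift, sample_size, warning_multiple, action_multiple):
    """Return (p0, p1, p2) for a two-sided chart on a normal mean.

    shift: (true mean - ideal mean) / sigma.
    warning_multiple, action_multiple: line positions in units of
    sigma / sqrt(sample_size) on either side of the ideal mean.
    """
    d = shift * math.sqrt(sample_size)  # shift of the standardised mean

    def inside(c):
        # Probability that the standardised sample mean lies within +/- c.
        return normal_cdf(c - d) - normal_cdf(-c - d)

    p0 = inside(warning_multiple)
    p2 = 1.0 - inside(action_multiple)
    p1 = 1.0 - p0 - p2
    return p0, p1, p2


if __name__ == "__main__":
    print(region_probabilities(0.0, 20, 1.5, 3.0))
```

Because the sample mean is continuous, points falling exactly on a line have probability zero and the convention for them does not affect the result.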

Consider first rule III; it has been shown that this rule is equivalent to a sequence of 
linear sequential tests with initial score on the acceptance boundary. By inspection it is 
seen that appropriate scores to assign to the sample point are given by 


$$a = 1, \quad b = n-1, \quad c = n, \tag{8}$$

and action is to be taken if the total penalty score rises a height $h = n$ above its previous 
least value; or equivalently if the sequential test ends on the rejection boundary given by 
$h = n$. When action is not required the position of the total penalty score relative to its 
previous minimum or of the cumulative score in the test is specified by one of the integers 
$0, 1, \ldots, n-1$. These may be regarded as the states of a Markov chain and the transition 
matrix $P$ is seen to be 

$$P = \begin{pmatrix}
p_0 & 0 & 0 & \cdots & 0 & p_1 \\
p_0 & 0 & 0 & \cdots & 0 & 0 \\
0 & p_0 & 0 & \cdots & 0 & 0 \\
\vdots & & & & & \vdots \\
0 & 0 & 0 & \cdots & p_0 & 0
\end{pmatrix}, \tag{9}$$

where row $i$ holds the probabilities of the transitions out of state $i$ (so that it is the 
transpose of this matrix that acts on the column vectors $p_r$ of (7)). 

The average run length is then given by equation (7). Alternatively, let $L_i$ be the average 
number of observations drawn before action is taken when the state is $i$. Then $L_0$ is the average 
run length of rule III. We have, by taking expectations conditional upon the result of the 
next sample, 

$$L_0 = p_0(N + L_0) + p_1(N + L_{n-1}) + p_2N = N + p_0L_0 + p_1L_{n-1}. \tag{10}$$


Similarly, we obtain the set of equations 

$$\begin{aligned}
L_0 &= N + p_0L_0 + p_1L_{n-1}, \\
L_1 &= N + p_0L_0, \\
L_i &= N + p_0L_{i-1} \quad (1 < i \leq n-1).
\end{aligned} \tag{11}$$


The last $n-1$ of these equations may be solved successively for $L_1, \ldots, L_{n-1}$ in terms of $L_0$, 
and the first equation used to evaluate $L_0$. We obtain the average run length of rule III: 

$$L = \frac{(1 - p_0 + p_1 - p_1p_0^{n-1})N}{(1 - p_0)(1 - p_0 - p_1p_0^{n-1})} \qquad (n \geq 2). \tag{12}$$
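
As a numerical check on the closed form for rule III, the successive solution described above can be carried out directly. This is a sketch only: it writes each $L_i$ as $\alpha_i + \beta_i L_0$, builds the coefficients from $L_i = N + p_0L_{i-1}$, and then solves the first equation for $L_0$. The function names are hypothetical.

```python
def arl_rule_iii(p0, p1, n, N):
    """Closed-form average run length of rule III (n >= 2)."""
    tail = p1 * p0 ** (n - 1)
    return (1.0 - p0 + p1 - tail) * N / ((1.0 - p0) * (1.0 - p0 - tail))


def arl_rule_iii_by_substitution(p0, p1, n, N):
    """Solve the run-length equations successively in terms of L_0."""
    alpha, beta = 0.0, 1.0  # L_0 = 0 + 1 * L_0
    for _ in range(n - 1):
        # L_i = N + p0 * L_{i-1}, applied n - 1 times to reach L_{n-1}.
        alpha, beta = N + p0 * alpha, p0 * beta
    # First equation: L_0 = N + p0 * L_0 + p1 * (alpha + beta * L_0).
    return (N + p1 * alpha) / (1.0 - p0 - p1 * beta)


if __name__ == "__main__":
    print(arl_rule_iii(0.8, 0.15, 3, 5))
    print(arl_rule_iii_by_substitution(0.8, 0.15, 3, 5))
```

When $p_1 = 0$ the expression reduces to $N/(1 - p_0)$, the run length of the corresponding single-sample scheme, which is a convenient sanity check.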
Scores for the various positions of the sample point on the chart for the repeated sequential 
tests equivalent to rule IV can be taken to be $a = n-1$, $b = 1$, $c = n$ and $h = n$. There are 
again only n possible states before action is taken for the Markov chain. The average run 
length is $L_0$, where $L_0$ is given by the equations 

$$\begin{aligned}
L_i &= N + p_0L_0 + p_1L_{i+1} \quad (0 \leq i < n-1), \\
L_{n-1} &= N + p_0L_0.
\end{aligned} \tag{13}$$

We obtain the average run length of rule IV: 

$$L = \frac{(1 - p_1^n)N}{1 - p_0 - p_1 + p_0p_1^n}. \tag{14}$$
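
The closed form for rule IV can be checked in the same way. The sketch below assumes that state i records the number of consecutive points so far between the warning and action lines, so that each $L_i$ is a multiple $c_i$ of $N + p_0L_0$ with $c_{n-1} = 1$ and $c_i = 1 + p_1c_{i+1}$. The function names are hypothetical.

```python
def arl_rule_iv(p0, p1, n, N):
    """Closed-form average run length of rule IV."""
    return (1.0 - p1 ** n) * N / (1.0 - p0 - p1 + p0 * p1 ** n)


def arl_rule_iv_by_substitution(p0, p1, n, N):
    """Work back from L_{n-1} to L_0 through the run-length equations."""
    c = 1.0  # L_{n-1} = 1 * (N + p0 * L_0)
    for _ in range(n - 1):
        # L_i = (N + p0 * L_0) + p1 * L_{i+1}
        c = 1.0 + p1 * c
    # L_0 = c * (N + p0 * L_0), solved for L_0.
    return c * N / (1.0 - c * p0)


if __name__ == "__main__":
    print(arl_rule_iv(0.6, 0.3, 4, 20))
    print(arl_rule_iv_by_substitution(0.6, 0.3, 4, 20))
```

For n = 2 rules III and IV describe the same scheme, and the two closed forms give the same value.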





4. CHOICE OF SCHEME 


In each of the two types of rule, III and IV, there are certain constants that may be chosen 
to give the scheme adopted some desired properties. Consider a two-sided chart with two 
sets of warning and action lines symmetrically placed about the ideal value for the parameter, 
or a one-sided chart with one warning and one action line. The disposable constants are then 
the sample size, N, the number of sample points considered, n, and the positions of the 
warning and action lines; the scheme may also be of one of two types. This may be con- 
trasted with the single-sample control chart scheme in which there are only two disposable 
constants, the sample size and the position of the action lines. In the single-sample case 
the scheme could be chosen to give a specified average run length for some unsatisfactory 
value of the parameter and maximum average run length for the ideal value (Page, 19545); 
that is to say, the amount of scrap at a certain value of the parameter that can be tolerated 
is stated and the scheme is chosen to satisfy this requirement and to give the longest run 
without action being taken when the quality is satisfactory. A corresponding choice for 
a scheme with warning lines could be made by selecting two unsatisfactory values of the 
parameter and the average run lengths that can be tolerated at these values and then 
choosing the disposable constants to gain maximum average run length on the ideal quality; 
in this way the average run length will have desirable properties over a range of the para- 
meter. In order to illustrate this we consider schemes for controlling the mean of a normal 
population with known variance, o?, for two-sided deviations from the .deal mean. Suppose 
that it is desired to ortain a scheme with average run lengths of 100 and 25 when there are 
departures from the ideal mean of amounts 0-40 and 0-80 respectively. The large number of 
disposable constants would make it laborious to find the scheme of the types considered 
which has the greatest run length on ideal quality, but one scheme with approximately the 
run lengths stated above is that of type IV with k = 4, sample size N = 20, warning lines 
at 4 +1-50/,/N, and action lines at ~ + 2-8750/,/N. We can compare this scheme with the 
best single-sample schemes for detecting the two sizes of departure from the mean. From 
the tables given in an earlier paper (Page, 1954b, Tables 2a-c) the single-sample scheme 
having average run length 100 when the mean is « + 0-40 and maximum average run length 
when the mean is has action lines drawn at yu + 2-820/,N, and sample size VN = 70. Simi- 
larly, the single-sample scheme with action lines at ~ + 2-830/,/N and sample size N = 17 
has L = 25 at mean yu + 0-80 and maximum average run length at mean w. The average run 
lengths for the three schemes for several values of A, where the mean is 4 + Ao, are shown in 
Table 1. It is seen that the rule of type IV has approximately the same run length as the 
single-sample scheme with sample size 17 for large deviations ( > 0-80) from the true mean; 
on the other hand, its run length for smaller deviations (0-2 < | A |< 0-4) is rather less than 
that of the single-sample scheme. Again it is seen that, while the type IV scheme compares
unfavourably with the second single-sample scheme ($N = 70$, $B_2 = 2.82$) in its behaviour
on good quality, it gains considerably for the large departures ($|\Delta| > 0.4$); clearly for a
single-sample scheme to cause action to be taken at least one sample point must be plotted.
The situation we are considering is where an amount $N/f$ of production is output between
visits from the inspector (where $f$ is the fraction of output sampled); thus the relevant
quantity to consider for even extreme deviations from the mean is the average run length,
or approximately the sample size.

Table 1. Average run lengths of control chart schemes

           Single-sample schemes          Type IV scheme
                                            n = k = 3
  Δ        N = 17         N = 70             N = 20
         B_2 = 2.83     B_2 = 2.82        B_2 = 2.875
                                           B_1 = 1.5
 0.0        3652          14600              3368
 0.2         780            560               545
 0.4         146            100                99
 0.6          47             71                42
 0.8          25             70                26
 1.0          19             70                21
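The comparison in Table 1 can be reproduced approximately from the normal distribution of the sample mean. The sketch below is an illustration rather than the author's computation: it assumes a two-sided chart with lines at $\mu \pm B\sigma/\sqrt{N}$, a process mean of $\mu + \Delta\sigma$, and the closed form (14) for rule IV; all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal c.d.f.


def zone_probabilities(delta, N, B1, B2):
    """Return (p0, p1) for a sample mean when the process mean is mu + delta*sigma.

    p0: probability of falling inside the warning lines;
    p1: probability of falling between warning and action lines (either side).
    """
    shift = delta * sqrt(N)
    inside_warning = PHI(B1 - shift) - PHI(-B1 - shift)
    inside_action = PHI(B2 - shift) - PHI(-B2 - shift)
    return inside_warning, inside_action - inside_warning


def arl_single_sample(delta, N, B):
    """Average run length (in articles) of a chart with action lines only."""
    shift = delta * sqrt(N)
    p_action = 1.0 - (PHI(B - shift) - PHI(-B - shift))
    return N / p_action


def arl_type4(delta, N, B1, B2, k):
    """Average run length of rule IV from the closed form (14)."""
    p0, p1 = zone_probabilities(delta, N, B1, B2)
    return N * (1 - p1 ** k) / (1 - p0 - p1 + p0 * p1 ** k)


if __name__ == "__main__":
    for delta in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
        print(delta,
              round(arl_single_sample(delta, 17, 2.83)),
              round(arl_type4(delta, 20, 1.5, 2.875, 3)))
```

Under these assumptions the type IV values at $\Delta = 0$ and $\Delta = 0.4$ come out close to the tabulated 3368 and 99; small differences elsewhere may reflect rounding of the line positions in the printed table.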

The choice of the best sample size could be examined in the same way as in the earlier
paper, but it would be extremely laborious to do so. In general, there will exist a value of
$N$ giving maximum average run length when the production has the ideal quality and
specified average run lengths at two values of the mean. The effects of different sample
sizes are illustrated by examples of rules of type IV which have average run lengths of 100
and 25 at $\mu \pm 0.4\sigma$, $\mu \pm 0.8\sigma$ respectively; the sample size, $N$, the positions of the warning and
action lines, $B_1$ and $B_2$, the number, $n$, of consecutive samples considered, and the average
run length, $L$, on the ideal mean are shown for five schemes in Table 2. There is, of course,
an upper limit to the size of the sample that can be used for the scheme to have the desired
run lengths. It is, however, again preferable to use larger samples if this is practically
possible.


Table 2. Average run lengths of type IV schemes

  N     n     B_2      B_1     L(0.8σ)   L(0.4σ)     L(0)
  5     3    3.25     1.25      25.3      118.0      584.6
 10     3    3.125    1.25      25.2      103.9     1096.9
 15     3    3.00     1.375     25.6      104.2     2286.4
 20     3    2.75     1.75      25.0      101.6     3155.9
 23     3    2.625    2.0       25.9       92.3     2607.8





























5. THE TABLES 


In the Appendix are given tables of the average run lengths of schemes for controlling the 
mean of a normal population with known standard deviation. The run lengths are tabulated 
for some schemes of type IV with sample sizes 5, 10, 15 and 20, taking into account the means
of the last three or four samples drawn (n = 3 or n = 4). In order to use the tables to deter- 
mine the suitable control chart scheme to be used it is necessary to decide first upon the 
average run lengths that can be permitted at two process means different from the ideal 
one, and which of the above sample sizes it is most convenient to use. A scheme approxi- 
mately satisfying these requirements may then be selected by inspection of the tables, 
possibly using some rough interpolation. The schemes shown in Tables 1 and 2 serve as 
examples of the method. 

It has been remarked that the use of moderately sized samples improves the average run- 
length function; consequently such samples are to be preferred where possible. For samples 
of ten and over the average run length for departures from the ideal mean of more than a 
standard deviation is near the sample size; this is easily seen since the probability that a 
point falls outside the action lines is nearly unity. 

Similar tables could be constructed for the average run lengths of schemes of type III. 
The two types of scheme are, of course, identical when the number of sample points 
considered is two. A few calculations indicate that the two types of scheme have very similar
properties and that, corresponding to a given scheme of one type, there exist schemes of 
the other type with approximately the same average run lengths. 


6. SIMULTANEOUS CONTROL OF THE MEAN AND STANDARD DEVIATION 


It is often required to control simultaneously both the mean and the standard deviation
of a normal population. For example, when goods are being manufactured to a specification
a change in either the process mean or standard deviation would alter the fraction of
defective articles produced; one process inspection scheme for this situation (Jennett &
Welch, 1939) is based upon the ratio $(U - \bar{x})/s$, where $U$ is an upper tolerance limit, and
$\bar{x}$, $s^2$ are sample estimates of mean and variance. Another possibility is to use the easily
calculated estimate of the standard deviation from the sample range, $w$, in place of $s$ and so
base the scheme upon $(U - \bar{x})/w$. The more usual procedure in practice is to keep two charts,
one recording the means and the other the ranges, of the samples. Before setting up any of
these schemes it is important to consider both the type of rectifying action that will be
taken and the changes that are likely to occur. If changes in the mean and standard deviation
require different rectifying actions it is necessary that the inspection scheme indicate
the type of change and not merely its existence. Again, the choice of scheme will be influenced
by the relative frequency of occurrence and importance of the types of change. For example,
if changes in the standard deviation are very infrequent there is little point in estimating $\sigma$
from each sample as would be necessary in Jennett & Welch’s rule. The same considerations
will hold to decide the average run lengths for various types of change that the scheme must
achieve. These remarks lead us to consider the possibility of controlling both mean and
standard deviation on a single chart for means when changes in the standard deviation are
relatively rare or unimportant. The remainder of this paper is devoted to the development
of such a scheme and to comparing it with a conventional scheme.

Control chart schemes using charts for the mean and range are often of the conventional
type with action lines drawn so that the probability that a sample point falls outside
the action lines on a given chart when the parameters have their ideal values is an assigned
amount, $\alpha$ ($\alpha = 1/500$ gives the conventional 3-sigma limits on the mean chart). Alternatively
and preferably, the action limits could be determined for each chart separately on the basis
of the average run length as described above and elsewhere (Page, 1954), so that the losses
incurred by the scheme may be estimated. This approach, of course, will not be very accurate
for some values of the parameters since, for example, a change in the standard deviation
affects the run length of both charts. Consider a single-sample scheme for both charts so
that action is taken if the mean of any sample, $\bar{x}$, falls outside the range $(\mu - B\sigma/\sqrt{N},\ \mu + B\sigma/\sqrt{N})$,
or if the range $w$ is greater than $v_1\sigma$, where $v_1$ is suitably chosen. Since the mean
and range of samples from a normal population are independently distributed with frequency
elements $f(\bar{x} \mid \mu', \sigma')\,d\bar{x}$, $g(w \mid \sigma')\,dw$, where $\mu'$, $\sigma'$ are the process mean and standard
deviation respectively, the probability that a given sample does not cause action to be
taken is $q_1' q_2'$, where
\[ q_1' = \int_{\mu - B\sigma/\sqrt{N}}^{\mu + B\sigma/\sqrt{N}} f(\bar{x} \mid \mu', \sigma')\,d\bar{x} = 1 - p_1', \tag{15} \]
\[ q_2' = \int_0^{v_1\sigma} g(w \mid \sigma')\,dw = 1 - p_2'. \tag{16} \]







Clearly $q_1'$, $q_2'$ are the probabilities that the mean and range respectively fall within the
action lines. The run length of the charts together is distributed as the smaller of two independent
geometric variates with parameters $p_1'$, $p_2'$: accordingly, the combined run length,
$l$, is a geometric variable with parameter $1 - q_1' q_2'$,
\[ P\{l = rN\} = (q_1' q_2')^{r-1}(1 - q_1' q_2'). \tag{17} \]
The average run length is thus
\[ L = \frac{N}{1 - q_1' q_2'}, \tag{18} \]
or, in terms of the average run lengths of the charts separately,
\[ L = \frac{L_1 L_2}{L_1 + L_2 - N}, \tag{19} \]
with an obvious notation. A numerical example is given in the next section.
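The equivalence of (18) and (19) is easy to confirm numerically. The sketch below assumes that $L_1 = N/p_1'$ and $L_2 = N/p_2'$ are the average run lengths of the mean and range charts taken separately; the function names are illustrative only.

```python
import math


def combined_arl_from_q(L1, L2, N):
    """Equation (18): L = N / (1 - q1 * q2), with q_j = 1 - N / L_j."""
    q1 = 1.0 - N / L1
    q2 = 1.0 - N / L2
    return N / (1.0 - q1 * q2)


def combined_arl(L1, L2, N):
    """Equation (19): L = L1 * L2 / (L1 + L2 - N)."""
    return L1 * L2 / (L1 + L2 - N)


if __name__ == "__main__":
    # Two charts each with average run length 2500 on samples of five.
    print(combined_arl_from_q(2500, 2500, 5))
    print(combined_arl(2500, 2500, 5))
```

For $L_1 = L_2 = 2500$ and $N = 5$ both expressions give about 1251, that is, approximately half the run length of either chart alone.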

We turn now to investigate what control can be achieved by the use of the mean chart 
only, thus avoiding the need to calculate and plot ranges. A possible scheme makes use of 
the warning lines in another way. Consider the rule: 


V. Choose n, N. Plot the means of samples of size N on a chart on which are drawn two 
warning and two action lines. Take action if: 
(i) any point falls outside the action lines,
or (ii) n consecutive points fall outside the warning lines, 
or (iii) two out of any set of n consecutive points fall outside opposite warning lines. 


A sequence of n points outside a warning line is evidence that the process mean has moved 
in the corresponding direction; similarly, the occurrence of the means of two near samples 
outside opposite warning lines points to an increase in the spread of the distribution. This 
scheme differs from one based on the range in that it depends on the variation between 
samples and not on that within samples. Such a scheme cannot be expected to be very 
sensitive in detecting increases in the standard deviation and will become less sensitive as 
the sample size increases: however, it may serve to keep a check on the standard deviation 
when control of the mean is of prime importance. In order to evaluate the average run
length, $L_0$, of rule V let $L_i'$ ($L_i''$), $i = 1, \ldots, n-1$, be the average further number of articles
drawn before action is taken when the last $i$ sample points have fallen between the upper
(lower) warning and action lines.

Let the probabilities that a sample point falls between the upper (lower) warning and
action lines be $r$ ($s$). Then in the notation of §3 we have $r + s = p_1$. By considering expectations
conditional upon the result of the first sample we have
\[ L_0 = p_0(L_0 + N) + r(L_1' + N) + s(L_1'' + N) + (1 - p_0 - p_1)N. \]
Therefore
\[ L_0 = p_0 L_0 + r L_1' + s L_1'' + N. \tag{20} \]

Again, by considering expectations conditional on the result of the next sample when
the last $i$ points have fallen between the upper warning and action lines, we obtain
\[ L_i' = p_0 L_0 + r L_{i+1}' + N \quad (i = 1, 2, \ldots, n-2), \tag{21} \]
\[ L_{n-1}' = p_0 L_0 + N. \tag{22} \]
A similar set of equations is obtained for the $L_i''$. Hence we have
\[ L_1' = \frac{(N + p_0 L_0)(1 - r^{n-1})}{1 - r}, \tag{23} \]
and similarly
\[ L_1'' = \frac{(N + p_0 L_0)(1 - s^{n-1})}{1 - s}. \tag{24} \]
These values substituted in (20) yield the solution
\[ L_0 = \frac{N(1 - rs - r^n - s^n + sr^n + rs^n)}{1 - p_0 - p_1 + rs(1 + p_0) + p_0(r^n + s^n - sr^n - rs^n)}. \tag{25} \]
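Equations (20) to (22) and the solution (25) can be compared directly. This is a minimal sketch, not the author's program: the function names are hypothetical, $n \ge 2$ is assumed, and the numerical inputs in the example are standard normal probabilities for warning lines at 2 and action lines at 3 standard errors with the process on target.

```python
import math


def arl_rule5_linear(p0, r, s, n, N):
    """Solve equations (20)-(22) and their lower-side analogues for L_0."""

    def first_state(t):
        # Return (a, b) with L_1 = a + b * L_0 for a run on one side, where t
        # is the probability of a further point in the same warning zone.
        a, b = float(N), p0  # (22): L_{n-1} = p0 * L_0 + N
        for _ in range(n - 2):
            a, b = N + t * a, p0 + t * b  # (21): L_i = p0 * L_0 + t * L_{i+1} + N
        return a, b

    a_up, b_up = first_state(r)
    a_lo, b_lo = first_state(s)
    # (20): L_0 = p0 * L_0 + r * L_1' + s * L_1'' + N
    return (N + r * a_up + s * a_lo) / (1.0 - p0 - r * b_up - s * b_lo)


def arl_rule5_closed(p0, r, s, n, N):
    """Closed form (25), using p1 = r + s."""
    num = 1 - r * s - r ** n - s ** n + s * r ** n + r * s ** n
    den = (1 - p0 - (r + s) + r * s * (1 + p0)
           + p0 * (r ** n + s ** n - s * r ** n - r * s ** n))
    return N * num / den


if __name__ == "__main__":
    # B_1 = 2, B_2 = 3, N = 5, n = 3, process mean and variance at ideal values.
    p0, r = 0.954499736, 0.021400234
    print(arl_rule5_linear(p0, r, r, 3, 5))
    print(arl_rule5_closed(p0, r, r, 3, 5))
```

With these inputs both routes give a value near 1383, in agreement with the corresponding entry for the combined rule in the tables.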


A similar rule for simultaneously controlling the mean and standard deviation is provided 
by rule III without modification. If the two ‘warning’ points causing action to be taken 
are on the same side of the ideal mean a shift in the mean in that direction will be suspected, 
while if they are on opposite sides an increase in the standard deviation will be suspected. 
Of course with this rule, if a change in the standard deviation occurs and action is taken 
because of points outside the warning lines, the two points are just as likely to be on the 
same side of the ideal mean as they are to be on opposite sides. Again, action could be taken 
after a change in the standard deviation because of a point outside the action lines. Con- 
sequently the wrong sort of change is more likely to be suggested by the scheme when a 
change in the standard deviation happens. This is of little importance if the rectifying action 
is independent of the suspected type of change. However, the correct inference is more 
likely to be drawn with rule V than with rule III. 

The rules of this section have nothing corresponding to the lower control limit for the 
range sometimes drawn in the conventional method. The purpose of such a line is to enable 
the occurrences of samples with small range to be investigated in the hope that it will be 
possible to reduce the process standard deviation. It would, of course, be possible to 
introduce a condition calling for investigation if two successive samples both had means 
differing little from the required process mean; however, we shall not consider the con- 
sequences of such a complication. 


7. NUMERICAL EXAMPLES OF SIMULTANEOUS CONTROL 


Consider schemes to control the mean and standard deviation of a normal population. 
Suppose that both mean and range charts with action lines only are kept so that the average 
run length, L, of the scheme is given by equations (18) and (19). If the action lines are chosen 
so that the process inspection schemes formed by taking each of the charts on its own have 
equal average run lengths when both the process mean and standard deviation are at their 
ideal values, then the average run length of the rule using both charts with these parameter 
values is approximately half that of the separate rules. For different values of the mean 
while the standard deviation is the same, the run length of the range chart is unchanged; 
consequently the run length of the combination is a little less than that of the mean chart, 
and approaches it as the change in the mean increases. On the other hand, for different 
values of the standard deviation but the same mean, both $L_1$ and $L_2$ are changed. The
average run lengths for a specific scheme with mean $\mu'$ and standard deviation $\sigma'$ are shown
in Tables 3 and 4. Let the action limits be drawn in the ‘conventional’ positions, i.e. so
that the probability that a sample taken from ideal quality gives a point outside the action
line is 1/500 for each chart. Thus the lines on the mean chart are drawn at $\mu \pm 3.09\sigma/\sqrt{N}$, and
for samples of $N = 5$ the range chart has a line at $w_1 = 5.24\sigma$. We then have, for $\mu' = \mu + \Delta\sigma$ and $\sigma' = K\sigma$,
\[ q_1' = \int_{(-B - \Delta\sqrt{N})/K}^{(B - \Delta\sqrt{N})/K} \frac{1}{\sqrt{(2\pi)}}\, e^{-\frac{1}{2}x^2}\,dx, \tag{26} \]
\[ q_2' = \int_0^{w_1/K\sigma} g(w)\,dw. \tag{27} \]

For comparison, the average run lengths of one of the combined rules are shown in the
tables; the scheme chosen has the same sample size, $N = 5$, and $B_2 = 3.00$, $B_1 = 2.00$, $n = 3$.
With these values the combined rule has approximately the same average run length on
the ideal quality as the scheme based upon both mean and range charts. It is seen that the
rule gives fair control against changes in the standard deviation, and for changes in the
mean has a smaller average run length than the rule using the two charts. Consequently
when it is necessary only to keep an eye on the standard deviation while controlling the
mean of a normal distribution a scheme of type V may be suitable. If so, the labour of
keeping both mean and range charts can be avoided.
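The entries for the combined rule can be recomputed from formula (25) once the zone probabilities are expressed through the normal c.d.f. The sketch below is an illustration under stated assumptions: the sample mean is taken to be normal with mean $\mu + \Delta\sigma$ and standard deviation $K\sigma/\sqrt{N}$, the chart lines are at $\mu \pm B_1\sigma/\sqrt{N}$ and $\mu \pm B_2\sigma/\sqrt{N}$, and the function name and argument order are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal c.d.f.


def arl_rule5_normal(delta, K, N, B1, B2, n):
    """Average run length of rule V for mean mu + delta*sigma and s.d. K*sigma."""
    shift = delta * sqrt(N)

    def upper(B):
        return (B - shift) / K

    def lower(B):
        return (-B - shift) / K

    p0 = PHI(upper(B1)) - PHI(lower(B1))  # inside both warning lines
    r = PHI(upper(B2)) - PHI(upper(B1))   # between upper warning and action lines
    s = PHI(lower(B1)) - PHI(lower(B2))   # between lower warning and action lines
    num = 1 - r * s - r ** n - s ** n + s * r ** n + r * s ** n
    den = (1 - p0 - (r + s) + r * s * (1 + p0)
           + p0 * (r ** n + s ** n - s * r ** n - r * s ** n))
    return N * num / den


if __name__ == "__main__":
    # Scheme used for comparison in the text: N = 5, B_1 = 2.00, B_2 = 3.00, n = 3.
    for delta in (0.0, 0.2, 0.4):
        print("delta", delta, round(arl_rule5_normal(delta, 1.0, 5, 2.0, 3.0, 3)))
    for K in (1.0, 1.25, 1.5):
        print("K", K, round(arl_rule5_normal(0.0, K, 5, 2.0, 3.0, 3)))
```

Under these assumptions the values for $\Delta = 0$, $\Delta = 0.2$ and $K = 1.25$ come out close to the tabulated 1383, 767 and 241.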


Table 3. Average run lengths of schemes, $N = 5$, $\mu' = \mu + \Delta\sigma$, $\sigma' = \sigma$

                   Rule using
 |Δ|     Mean     Range     Mean and range     Combined rule
 0.0     2500     2500          1250               1383
 0.2     1149     2500           789                767
 0.4      359     2500           314                257
 0.6      125     2500           119                 90
 0.8       52     2500            51                 38
 1.0       25     2500            25                 20


Table 4. Average run lengths of schemes, $N = 5$, $\mu' = \mu$, $\sigma' = K\sigma$

               Rule using
  K      Range     Mean and range     Combined rule
 1.0     2500          1250               1383
 1.25     198           158                241
 1.5       51            38                 91
 1.75      24            18                 50
 2.0       15            12                 33
 2.25      11             9                 25
 2.5        9             7                 20

A short table of schemes of type V is given in the Appendix, Table 2, for controlling a
normal population with samples of five measurements, using at most a sequence of three
such samples. The average run lengths for changes in the mean are shown in Table 2a, and
for changes in the standard deviation in Table 2b.


I wish to thank Dr D. R. Cox for many helpful discussions on the subject of this paper,
and the Director, Mathematical Laboratory, Cambridge, for permission to use the EDSAC
for the calculation of the tables.

APPENDIX
Tables of average run lengths of rules IV
Charts with action lines at $\mu \pm B_2\sigma/\sqrt{N}$, warning lines at $\mu \pm B_1\sigma/\sqrt{N}$.

Table 1a. N = 5

                       n = 3, B_1 =                          n = 4, B_1 =
B_2     Δ      1.00   1.25   1.50   1.75   2.00      1.00   1.25   1.50   1.75   2.00
3.00   0.0    202    503   1074   1598   1802       527   1190   1692   1830   1850
       0.2    137    285    ...    751    ...       305    ...    798    872    887
       0.4     63    102    160    221    261       112    175    236    269    280
       0.6     32     43     57     75     90        47     62     79     93    100
       0.8     19     23    ...    ...     38        25     29     34     39     42
       1.0     14     15     16     18     20        16     17     19     20     21
       1.2     11     11     12     13     13        12     12     12     13     13
       1.4      9      9      9      9      9         8      8      8      9      9
       1.6      7      7      7      7      7         7      7      7      7      7
       1.8      6      6      6      6      6         6      6      6      6      6
3.125  0.0    208    549   1326   2251   2691       579   1513   2451   2758   2806
       0.2    142    312    637   1008   1213       ...    723   1100   1251   1283
       0.4     66    111    184    274    344       122    ...    300    360    382
       0.6     33     45     63     87    110        51     70     94    115    128
       0.8     20     24     29     36     44        27     32     39     45     51
       1.0     14     16     17     19     22        18     19     21     23     25
       1.2     11     12     12     13     13        13     13     13     14     14
       1.4      9      9      9      9      9         9      9      9     10     10
       1.6      7      7      7      7      7         7      7      7      7      7
       1.8      6      6      6      6      6         6      6      6      6      6
3.25   0.0    213    585   1575   3110   4041       621   1854   3518   4202   4319
       0.2    146    ...    747   1331   1726       364    873   1504   1810   1881
       0.4     68    118    207    335    452       132    237    378    483    527
       0.6     35     48     69    100    134        54     78    110    143    165
       0.8     21     25     31     40     50        29     35     43     53     62
       1.0     15     16     18     21     24        19     21     23     25     28
       1.2     12     12     13     13     14        14     14     15     15     16
       1.4      9      9     10     10     10        10     10     10     10     11
       1.6      8      8      8      8      8         8      8      8      8      8
       1.8      6      6      6      6      6         6      6      6      6      6


Table 1b. N = 10

                       n = 3, B_1 =                          n = 4, B_1 =
B_2     Δ      1.00   1.25   1.50   1.75   2.00      1.00   1.25   1.50   1.75   2.00
2.875  0.0    387    896   1684   2249   2434       933   1816   2333   2456   2473
       0.2    191    336    531    689    763       360    572    720    774    786
       0.4     70     90    119    149    171        98    127    156    175    183
       0.6     34     38     43     49     55        42     46     51     56     59
       0.8     22     22     23     24     26        24     25     25     26     27
       1.0     15     16     16     16     16        16     16     16     16     16
3.00   0.0    404   1005   2149   3197   3604      1055   2580   3384   3659   3700
       0.2    201    374    643    ...   1044       405    709    963   1069   1095
       0.4     71     97    134    177    213       107    146    188    220    ...
       0.6     35     40     47     55     63       ...     51     58     65     70
       0.8     23     24     25     26     28        26     27     28     29     30
       1.0     16     16     17     17     17        17     17     17     17     18
3.125  0.0    417   1097   2651    ...   5382      1159   3026   4901   5516   5613
       0.2    209    408    761   1174   1436       447    862   1285   1491   1545
       0.4     74    104    149    ...    264       116    165    ...    277    306
       0.6     37     43     51     61     72        48     56     65     75     84
       0.8     24     25     27     29     31        28     29     31     32     34
       1.0     17     17     18     18     18        18     19     19     19     19
3.25   0.0    426   1169   3150   6220   8081      1241   3707   7036   8404   8637
       0.2    216    437    879   1502   1986       ...   1022   1700   2094   2208
       0.4     76    110    164    241    325       124    184    ...    348    399
       0.6     39    ...     54     67     82        51     60     73     87    100
       0.8     25     27    ...     31     34        30     32     33     36     38
       1.0     18     19     19     19     20        20     20     20     20     21

Table 1c. N = 15

                       n = 3, B_1 =                          n = 4, B_1 =
B_2     Δ      1.00   1.25   1.50   1.75   2.00      1.00   1.25   1.50   1.75   2.00
2.75   0.0    548   1163   1929   2367   2492      1201   2034   2423   2505   2516
       0.2    208    324    457    556    601       344    483    573    607    615
       0.4     67     80     95    110    123        86    100    114    124    129
       0.6     35     36     38     40     42        39     40     42     43     44
       0.8     23     23     23     23     23        23     23     23     23     24
       1.0     17     17     17     17     17        17     17     17     17     17
2.875  0.0    581   1344   2526   3373   3651      1399   2724   3500   3684   3710
       0.2    222    365    552    714    800       392    594    748    814    831
       0.4    ...     86    106    128    147        94    114    134    149    158
       0.6     37     39     41     44     47        42     44     46     48     50
       0.8     24     24     24     24     25        25     25     25     25     25
       1.0     18     18     18     18     18        18     18     18     18     18
3.00   0.0    607   1508   3223   4795   5406      1582   3570   5076   5489   5549
       0.2    234    403    656    914   1071       439    722    977   1101   1137
       0.4     74     93    117    147    175       103    127    156    181    196
       0.6     39     41     44     48     53        46     48     51     54     57
       0.8     25     25     26     26     27        27     27     27     27     27
       1.0     18     18     18     18     18        18     18     18     18     18
3.125  0.0    626   1645   3977   6753   8073      1738   4539   7351   8275   8420
       0.2    244    438    765   1159   1439       482    862   1270   1502   1574
       0.4     78     98    128    167    208       110    141    180    218    244
       0.6     41     44     48     53     58        49     52     56     61     65
       0.8     27     27     27     28     29        29     29     29     30     30
       1.0     19     19     19     19     19        19     19     19     19     19

Table 1d. N = 20

                       n = 3, B_1 =                          n = 4, B_1 =
B_2     Δ      1.00   1.25   1.50   1.75   2.00      1.00   1.25   1.50   1.75   2.00
2.75   0.0    730   1550   2571   3156   3323      1601   2712   3230   3340    ...
       0.2    226    333    458    558    608       355    484    575    614    625
       0.4     70     79     90    102    111        86     95    105    112    117
       0.6     37     38     39     40     41        40     40     41     41     42
       0.8     25     25     25     25     25        25     25     25     25     25
       1.0     21     21     21     21     21        21     21     21     21     21
2.875  0.0    774   1792   3368   4497   4868      1866   3632   4667   4913   4947
       0.2    241    372    545    703    795       402    586    737    810    832
       0.4     74     85     99    115    129        93    106    120    132    140
       0.6     39     40     41     43     45        43     44     45     46     46
       0.8     26     26     26     26     26        26     26     26     26     26
       1.0     21     21     21     21     21        21     21     21     21     21
3.00   0.0    809   2010   4297   6394   7208      2110   4761   6768   7318   7399
       0.2    254    409    638    882   1045       ...    701    943   1077   1121
       0.4     78     91    108    129    150       101    117    136    155    168
       0.6     42     43     45     47     49        47     48     49     50     52
       0.8     27     27     27     27     28        28     28     28     28     28
       1.0     22     22     22     22     22        22     22     22     22     22
3.125  0.0    834   2194    ...   9003  10764      2317   6053   9802  11033  11226
       0.2    264    443    735   1095   1376       489    824   1200   1440   1528
       0.4     81     96    117    144    174       109    129    155    182    203
       0.6     44     46     48     50     53        50     52     53     55     57
       0.8     29     29     29     29     29        29     29     29     29     30
       1.0     22     22     22     22     22        22     22     22     22     22

Tables of average run lengths of rules V
Charts with action lines at $\mu \pm B_2\sigma/\sqrt{N}$, warning lines at $\mu \pm B_1\sigma/\sqrt{N}$. Sample size, $N = 5$.

Table 2a. $\mu' = \mu + \Delta\sigma$, $\sigma' = \sigma$.          Table 2b. $\mu' = \mu$, $\sigma' = K\sigma$.

                  B_1 =                                   B_1 =
B_2     Δ      1.50   1.75   2.00        B_2     K      1.50   1.75   2.00
2.875  0.0    404    722   1021        2.875  1.00    404    722   1021
       0.2    299    457    564               1.25    117    160    198
       0.4    128    170    196               1.50    ...     69     79
       0.6     51     63     73               1.75     35     41     45
       0.8     25     29     32               2.00     26     28     30
       1.0     15     16     17               2.25     20     22     23
3.00   0.0    446    879   1383        3.00   1.00    446    879   1383
       0.2    344    517    767               1.25    ...    186    241
       0.4    148    211    257               1.50     61     77     91
       0.6     57     74     90               1.75     38     44     50
       0.8     27     32     38               2.00     27     31     33
       1.0     16     18     20               2.25     22     23     25
3.125  0.0    481   1033   1830        3.125  1.00    481   1033   1830
       0.2    385    713   1037               1.25    140    213    292
       0.4    169    260    ...               1.50     66     86    105
       0.6     63     86    110               1.75     41     48     56
       0.8     29     36     44               2.00    ...     33     36
       1.0     17     19     22               2.25     23     25     27
3.25   0.0    507   1173   2340        3.25   1.00    507   1173   2340
       0.2    422    856   1384               1.25    151    240    349
       0.4    188    314    438               1.50     71     95    120
       0.6     69     99    133               1.75     44     53     62
       0.8     32     40     50               2.00     31     35     39
       1.0     18     21     24               2.25     24     26     29

MISCELLANEA


Approximations to the probability integral and certain percentage points of a
multivariate analogue of Student’s t-distribution*


By C. W. DUNNETT† and M. SOBEL‡


Cornell University


In a recent paper (Dunnett & Sobel, 1954), a multivariate analogue of Student’s t-distribution was
defined as the joint distribution of $p$ variates $t_{in} = z_i/s$ ($i = 1, 2, \ldots, p$). Here the $z_i$ have a non-singular
multivariate normal distribution with means 0, common unknown variance $\sigma^2$ and known correlation
matrix $\{\rho_{ij}\}$ and $ns^2/\sigma^2$ has a $\chi^2$-distribution, independent of the $z_i$, with $n$ degrees of freedom. The joint
density of the $t_{in}$ is given by
\[ f(t_{1n}, \ldots, t_{pn}) = \frac{A^{1/2}\,\Gamma\{\tfrac{1}{2}(n+p)\}}{(n\pi)^{p/2}\,\Gamma(\tfrac{1}{2}n)} \Bigl[1 + \frac{1}{n}\sum_{i,j} a_{ij} t_{in} t_{jn}\Bigr]^{-\frac{1}{2}(n+p)}, \tag{1} \]
where $A$ is the determinant of the positive definite matrix $\{a_{ij}\} = \{\rho_{ij}\}^{-1}$. In the authors’ previous paper
expressions and tables for the probability integral and equi-co-ordinate percentage points of (1) were
obtained for the bivariate case ($p = 2$). In that paper an equi-co-ordinate P-percentage point was
defined as the value of $h$ for which
\[ \int_{-\infty}^{h} \cdots \int_{-\infty}^{h} f(t_{1n}, \ldots, t_{pn})\,dt_{1n} \cdots dt_{pn} = P. \tag{2} \]

In this note we shall derive approximations (which are also lower bounds) to the probability integral
of (1) applicable in special cases when $p > 2$. These results then can be used to obtain approximations
(which are also upper bounds) to any equi-co-ordinate P-percentage point. Equation (11) below shows
that the probability integral table (Table 1) of the previous paper can be used when $\rho_{ij} = \tfrac{1}{2}$ ($i \ne j$) to
obtain numerically the approximations referred to above.

Letting $I = I_n(h_1, h_2, \ldots, h_p)$ denote the left-hand member of (2) with the upper limit of $t_{in}$ replaced by
$h_i$ ($i = 1, 2, \ldots, p$), we have
\[ I = \Pr\{t_{1n} < h_1, \ldots, t_{pn} < h_p\} = \Pr\{z_1 < h_1 s, \ldots, z_p < h_p s;\ n\}. \tag{3} \]

Fixing $s$ as the last variable to be integrated, we can write
\[ I = \int_0^{\infty} G_p(h_1 s, \ldots, h_p s;\ \{\rho_{ij}\})\, f_n(s)\,ds, \tag{4} \]
where $G_p = G_p(x_1, \ldots, x_p;\ \{\rho_{ij}\})$ is the c.d.f. of the standardized p-variate normal distribution with
correlation matrix $\{\rho_{ij}\}$, and $f_n(s)$ is the probability density function of $s$ with $n$ degrees of freedom.


ASSUMPTION 1. The matrix $\{\rho_{ij}\}$ has the structure $\rho_{ij} = b_i b_j$ ($i \ne j$), where $0 \le b_i < 1$ ($i = 1, 2, \ldots, p$).
It follows that $\{\rho_{ij}\}$ is positive definite, since the associated quadratic form $\sum (1 - b_i^2)x_i^2 + (\sum b_i x_i)^2$ is
positive for $x_i$ not all zero.


ASSUMPTION 2. The upper limits of integration in (3) are non-negative, i.e. $h_i \ge 0$ ($i = 1, 2, \ldots, p$).

We note that when $\rho_{ij} = \rho \ge 0$ ($i \ne j$), Assumption 1 is satisfied since this occurs when $b_i = \sqrt{\rho}$
($i = 1, 2, \ldots, p$).

If we let $y_0, y_1, \ldots, y_p$ denote independent, normally distributed chance variables with zero means
and unit variances, and let $c_i = \sqrt{(1 - b_i^2)}$, then the joint distribution of the chance variables
\[ z_i = c_i y_i - b_i y_0 \quad (i = 1, 2, \ldots, p) \]
is a standardized p-variate normal distribution with correlation matrix $\{\rho_{ij}\}$.

* This research was supported in part by the United States Air Force, through the Office of Scientific
Research of the Air Research and Development Command.

† Now at Lederle Laboratories, Pearl River, N.Y.

‡ Now at Bell Telephone Laboratories, Allentown, Pa.
Consider the function $G_p$ in (4) above, hold $s$ fixed, and let $a_i = h_i s$ ($i = 1, 2, \ldots, p$). Then
\[ G_p = \Pr\{c_i y_i - b_i y_0 < a_i\ (i = 1, 2, \ldots, p)\} = \int_{-\infty}^{\infty} \prod_{i=1}^{p} G\Bigl(\frac{a_i + b_i y_0}{c_i}\Bigr) g(y_0)\,dy_0, \tag{5} \]
where $g(y)$ is the standard univariate normal density and $G(y)$ is its c.d.f.
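The single integral (5) is convenient for numerical work. The sketch below is an illustration only: it treats the equal-correlation case $b_i = \sqrt{\rho}$ with equal limits $a_i = h$ (the limiting case $n = \infty$, in which $s$ is constant), evaluates (5) by Simpson's rule on a truncated range, and finds an equi-co-ordinate percentage point by bisection. The function names, the truncation at $\pm 8$ and the number of intervals are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def equicorrelated_normal_cdf(h, p, rho, half_width=8.0, intervals=800):
    """Pr{z_1 < h, ..., z_p < h} for standardized normal z_i with common
    correlation rho, evaluated from integral (5) by Simpson's rule."""
    b = sqrt(rho)
    c = sqrt(1.0 - rho)
    step = 2.0 * half_width / intervals  # intervals must be even
    total = 0.0
    for j in range(intervals + 1):
        y0 = -half_width + j * step
        if j == 0 or j == intervals:
            weight = 1
        else:
            weight = 4 if j % 2 else 2
        total += weight * _STD.cdf((h + b * y0) / c) ** p * _STD.pdf(y0)
    return total * step / 3.0


def equicoordinate_point(P, p, rho, lo=0.0, hi=10.0):
    """Value of h >= 0 at which the probability integral equals P (bisection)."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if equicorrelated_normal_cdf(mid, p, rho) < P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(equicorrelated_normal_cdf(0.0, 2, 0.5))   # bivariate orthant probability
    print(equicoordinate_point(0.95, 3, 0.5))
```

For $p = 2$, $\rho = \tfrac{1}{2}$ and $h = 0$ the routine should return 1/3, the known orthant probability, which gives a simple check on the quadrature.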
It can easily be shown (see, for example, Kimball, 1951) that for any $r$ non-decreasing, bounded
functions $F_i(x)$ ($i = 1, 2, \ldots, r$) of a chance variable $x$ we have (letting $E$ denote expectation)
\[ E\Bigl\{\prod_{i=1}^{r} F_i(x)\Bigr\} \ge \prod_{i=1}^{r} E\{F_i(x)\}. \tag{6} \]
Applying (6) to (5) gives
\[ G_p \ge \prod_{i=1}^{p} \int_{-\infty}^{\infty} G\Bigl(\frac{a_i + b_i y_0}{c_i}\Bigr) g(y_0)\,dy_0 = \prod_{i=1}^{p} \Pr\{c_i y_i - b_i y_0 < a_i\} = \prod_{i=1}^{p} G(a_i). \tag{7} \]

Substituting this result in (4) and applying (6) again gives
\[ I \ge \int_0^{\infty} \prod_{i=1}^{p} G(h_i s)\, f_n(s)\,ds \ge \prod_{i=1}^{p} \int_0^{\infty} G(h_i s)\, f_n(s)\,ds = \prod_{i=1}^{p} \Pr\{z_i < h_i s;\ n\} = \prod_{i=1}^{p} \Pr\{t_{in} < h_i\}. \tag{8} \]

This lower bound to $I$ does not depend on $\{\rho_{ij}\}$ and is easily calculated from tables of the c.d.f. of the
univariate Student t-distribution with $n$ degrees of freedom.

Under Assumptions 1 and 2 a sharper (i.e. higher) lower bound can be obtained by obvious modifications
of the above argument. This lower bound depends on the c.d.f. of the bivariate t-distribution
considered by Dunnett & Sobel (1954) which is tabulated there for the special case $\rho = \tfrac{1}{2}$ and $h_1 = h_2 \ge 0$.
The results are, for even $p \ge 2$,
\[ I \ge \prod_{i=1}^{\frac{1}{2}p} \Pr\{t_{2i-1,n} < h_{2i-1},\ t_{2i,n} < h_{2i}\}, \tag{9} \]
and, for odd $p \ge 3$,
\[ I \ge \Pr\{t_{1n} < h_1\} \prod_{i=1}^{\frac{1}{2}(p-1)} \Pr\{t_{2i,n} < h_{2i},\ t_{2i+1,n} < h_{2i+1}\}. \tag{10} \]

If we replace Assumptions 1 and 2 above by


ASSUMPTION 1’. The matrix $\{\rho_{ij}\}$ has the structure $\rho_{ij} = \rho$ ($i \ne j$), where $0 \le \rho < 1$. Clearly, $\{\rho_{ij}\}$ is
positive definite.


ASSUMPTION 2’. The upper limits of integration in (3) are all equal, i.e. $h_i = h$ ($i = 1, 2, \ldots, p$);
then we can replace (9) and (10) by a single inequality and write for any integer $p \ge 2$
\[ I \ge [\Pr\{t_{1n} < h,\ t_{2n} < h\}]^{\frac{1}{2}p}, \tag{11} \]
which is sharper than (10) (when $p$ is odd). The proof of (11) is similar to the above proof, using, instead
of (6), the well-known inequality (see, for example, Cramér, 1946, p. 176)
\[ \mu_p^{1/p} \ge \mu_q^{1/q} \quad \text{for } p \ge q, \tag{12} \]
where $\mu_p$ denotes the $p$th absolute moment.


Paulson (1952) suggested an alternative method of obtaining an approximation and lower bound
to (3) which is based on the Bonferroni inequality
\[ \Pr\{t_{1n} < h_1, \ldots, t_{pn} < h_p\} \ge 1 - \sum_{i=1}^{p} \Pr\{t_{in} > h_i\}. \tag{13} \]

This method has the advantage that it requires no assumptions. However, when the inequalities (8)
through (11) hold, then (8) and hence also (9), (10) and (11) give sharper lower bounds than (13). To show
this for (8) let
\[ 1 - q_i = \Pr\{t_{in} < h_i\}, \tag{14} \]
so that $0 \le q_i \le 1$. Then we have to prove that
\[ \prod_{i=1}^{p} (1 - q_i) \ge 1 - \sum_{i=1}^{p} q_i. \tag{15} \]
This certainly holds for $p = 1$, and a straightforward induction shows that (15) holds
for all positive integers $p$.
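The ordering of the two bounds can be illustrated in the limiting case $n = \infty$, where each $t_{in}$ is a standard normal variate and the bounds (8) and (13) lead to closed-form approximations for the equi-co-ordinate point. The sketch below covers that limiting case only; the function names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution (the n = infinity limit)


def point_from_product_bound(P, p):
    """Upper bound on h from (8): solve G(h)**p = P."""
    return _STD.inv_cdf(P ** (1.0 / p))


def point_from_bonferroni(P, p):
    """Upper bound on h from (13): solve 1 - p * (1 - G(h)) = P."""
    return _STD.inv_cdf(1.0 - (1.0 - P) / p)


def product_exceeds_bonferroni(qs):
    """Check inequality (15) for a list of probabilities q_i in [0, 1]."""
    product = 1.0
    for q in qs:
        product *= 1.0 - q
    return product >= 1.0 - sum(qs)


if __name__ == "__main__":
    for P in (0.50, 0.75, 0.95, 0.99):
        print(P,
              round(point_from_bonferroni(P, 3), 2),
              round(point_from_product_bound(P, 3), 2))
```

Because (8) is the sharper lower bound on the probability, the point obtained from it is never larger than the point obtained from (13).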
The approximations (13), (8), (9), (10) and (11) can be used to obtain upper bounds to the equi-co-ordinate
percentage points of (1) for $p > 2$. Table 1 compares the approximations with the exact
values for the special cases $\rho_{ij} = \tfrac{1}{2}$ ($i \ne j$); $n = 5, \infty$; $p = 3, 9$ and $P$ = 0.50, 0.75, 0.95, 0.99; all entries are
rounded to the nearest two decimal places. For $n = \infty$ columns (13) and (8) require a table of the
univariate normal c.d.f.; columns (10), (11) and the exact values were obtained from unpublished tables
of the National Bureau of Standards (1953). For $n = 5$, columns (13) and (8) require a table of the univariate
Student c.d.f. with 5 degrees of freedom; columns (10) and (11) and the exact values were computed
by numerical integration, i.e. by applying Simpson’s rule and using the National Bureau of
Standards tables (1953) to evaluate the integral in (4). The exact values for $P$ = 0.95, 0.99 will also appear
in another paper by Dunnett where equi-co-ordinate percentage points for the case $\rho_{ij} = \tfrac{1}{2}$ ($i \ne j$) will be
given for $p$ = 3(1)9, $P$ = 0.95, 0.99 and $n$ = 5(1)20, 24, 30, 40, 60, 120, $\infty$.


Table 1. Comparison of exact equi-co-ordinate percentage points with
approximations for selected values of n, p and P

                           p = 3                                  p = 9
                  Approximations        Exact           Approximations        Exact
  P      n    (13)   (8)   (10)  (11)   values      (13)   (8)   (10)  (11)   values
 0.99    5    4.46  4.46  4.37  4.32    4.21        5.75  5.75  5.61  5.59    5.03
         ∞    2.71  2.71  2.70  2.70    2.68        3.06  3.06  3.05  3.05    3.00
 0.95    5    2.91  2.90  2.82  2.78    2.68        3.93  3.90  3.79  3.78    3.30
         ∞    2.13  2.12  2.10  2.09    2.06        2.54  2.53  2.52  2.51    2.42
 0.75    5    1.62  1.55  1.46  1.42    1.32        2.48  2.38  2.28  2.26    1.81
         ∞    1.38  1.33  1.28  1.26    1.19        1.91  1.86  1.82  1.82    1.60
 0.50    5    1.07  0.89  0.80  0.74    0.62        1.93  1.71  1.60  1.58    1.10
         ∞    0.97  0.82  0.74  0.70    0.59        1.59  1.45  1.39  1.38    1.04










For each of the approximations (13), (8), (10) and (11) in Table 1 it is conjectured that further calcula- 
tion will establish the following properties for the difference D = D(n, p, P) between the approximation 
and its exact value: 

(i) D is increasing with p for each n and P, 
(ii) D is decreasing with n for each p and P, 
(iii) D is decreasing with P for n = ∞ and each p,
(iv) D is parabolic-shaped with P for n = 5 and each p. 
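
The four conjectured properties can at least be checked against the entries of Table 1 itself. The sketch below does this; the dictionary layout, the function names and the reading of 'parabolic-shaped' as 'smallest difference at an interior value of P' are assumptions made for illustration, not part of the paper.

```python
# Table 1 entries in hundredths, stored as integers to avoid floating-point noise.
# Key: (P, n, p); value: approximations (13), (8), (10), (11), then the exact value.
INF = float("inf")
P_LEVELS = (0.50, 0.75, 0.95, 0.99)
TABLE_1 = {
    (0.99, 5, 3): (446, 446, 437, 432, 421), (0.99, INF, 3): (271, 271, 270, 270, 268),
    (0.95, 5, 3): (291, 290, 282, 278, 268), (0.95, INF, 3): (213, 212, 210, 209, 206),
    (0.75, 5, 3): (162, 155, 146, 142, 132), (0.75, INF, 3): (138, 133, 128, 126, 119),
    (0.50, 5, 3): (107, 89, 80, 74, 62),     (0.50, INF, 3): (97, 82, 74, 70, 59),
    (0.99, 5, 9): (575, 575, 561, 559, 503), (0.99, INF, 9): (306, 306, 305, 305, 300),
    (0.95, 5, 9): (393, 390, 379, 378, 330), (0.95, INF, 9): (254, 253, 252, 251, 242),
    (0.75, 5, 9): (248, 238, 228, 226, 181), (0.75, INF, 9): (191, 186, 182, 182, 160),
    (0.50, 5, 9): (193, 171, 160, 158, 110), (0.50, INF, 9): (159, 145, 139, 138, 104),
}


def differences(P, n, p):
    """D = approximation - exact (in hundredths), one value per approximation column."""
    *approx, exact = TABLE_1[(P, n, p)]
    return [a - exact for a in approx]


def check_conjectures():
    """Return, for each conjectured property (i)-(iv), whether Table 1 agrees with it."""
    cols = range(4)
    i = all(differences(P, n, 9)[c] > differences(P, n, 3)[c]
            for P in P_LEVELS for n in (5, INF) for c in cols)
    ii = all(differences(P, 5, p)[c] > differences(P, INF, p)[c]
             for P in P_LEVELS for p in (3, 9) for c in cols)
    iii = iv = True
    for p in (3, 9):
        for c in cols:
            at_inf = [differences(P, INF, p)[c] for P in P_LEVELS]
            iii = iii and all(a > b for a, b in zip(at_inf, at_inf[1:]))
            at_5 = [differences(P, 5, p)[c] for P in P_LEVELS]
            # Assumed reading of "parabolic-shaped": smallest D at an interior P.
            iv = iv and min(at_5[1:-1]) <= min(at_5[0], at_5[-1])
    return {"i": i, "ii": ii, "iii": iii, "iv": iv}


print(check_conjectures())
```

Agreement on sixteen tabulated cases is of course no proof; it only shows that the table does not contradict the conjectures.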


The values of n, p and P in Table 1 were selected to cover a wide range of practical interest. Since only 
a limited number of exact values for finite n are known the inequalities considered in this paper, which 
have a fairly wide application, should prove to be useful. 
Galton's rank-order test


By J. L. HODGES, Jr. 
University of California, Berkeley 


1. One of the first uses of rank-order in statistics was that of Galton in the study of data referred to
him by Charles Darwin (1876). A quantity was measured on each of n treated subjects and also on each
of n control subjects, obtaining measurements $x_1, x_2, \dots, x_n$ and $y_1, y_2, \dots, y_n$ respectively. The 2n
measurements were arranged in common increasing order of size, and Galton counted the number, say G,
of times that an x of given rank exceeded the y of the same rank. In Galton's case, n was 15 and G was 13,
which was regarded as evidence that the first sample came from a population stochastically larger than
that from which the second sample came. In modern language, if G is sufficiently large, we reject the
null hypothesis that the treatment is without effect, in favour of the alternative that the treatment tends
to increase the measurements.*


2. Galton was not able to attach a significance level to his observation, inasmuch as he did not know
the distribution of G under the null hypothesis. However, that distribution has recently been discovered
in another connexion and proved to be very simple and elegant. Consider the usual penny-tossing game
played by Peter and Paul, in which Peter pays one unit to Paul if the penny lands 'heads' and Paul
pays one unit to Peter if it lands 'tails'. Suppose we are given that after 2n tosses, the contestants are
even. Let F denote the number of times that Peter was in the lead (where conventionally we assert that,
when the game is even, the player leads who led on the preceding toss). If we identify Peter's winning
the kth toss with the event that the kth measurement in the Galton problem is an x, it appears that
F = 2G.

The conditional distribution of F has been found by Chung & Feller (1949) to be uniform, under the
hypothesis that Peter's n victories are randomly distributed among the 2n trials. But this is just the
distribution of the ranks of the x measurements under the hypothesis that the treatment has no effect.
Therefore, since G then takes each of the n + 1 values 0, 1, ..., n with equal probability, we may assert,
under the null hypothesis, that P(G ≥ g) = (n − g + 1)/(n + 1). For example, to
Galton's observation we may attach the significance probability 3/16.


3. The proof of Chung & Feller is by means of a double generating function, and it may be of interest
to have an enumerative proof of so simple a result. Consider the class of $\binom{2n}{n}$ possible arrangements
of n 0's and n 1's, and for each such arrangement determine its score g by counting those 1's which precede
the 0 of the same ordinal number. For example, in the sequence 01110001 the second and third 1's are
counted, since the second 1 precedes the second 0 and the third 1 precedes the third 0, but the first and
fourth 1's are not counted as they follow the first and fourth 0's respectively. The sequence just given
has a total score g = 2. For a sequence of length 2n, the possible values of g are 0, 1, ..., n. We shall
denote the score of a sequence $(a_1 \dots a_{2n})$ by $[a_1 \dots a_{2n}]$.


THEOREM. The number of arrangements with score x is independent of x.
We shall prove this by defining a mapping $T_n$, which has the following properties:

(a) The domain of $T_n$ is the set of all sequences of length 2n whose score is positive.

(b) The range of $T_n$ is the set of all sequences of length 2n whose score is less than n.

(c) $T_n$ is a 1-1 function.

(d) $[T_n(a_1 \dots a_{2n})] = [a_1 \dots a_{2n}] - 1$.

As such a function gives a 1-1 mapping of the arrangements with score x on to those with score x − 1
for x = 1, ..., n, it establishes the desired result.
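
For small n the theorem can be confirmed by direct enumeration before following the proof. The sketch below is only an illustration: the names `score` and `score_counts` are not from the paper, and the brute-force loop is practical only for modest n.

```python
from collections import Counter
from itertools import combinations


def score(seq):
    """Count the 1's that precede the 0 of the same ordinal number."""
    ones = [i for i, a in enumerate(seq) if a == 1]
    zeros = [i for i, a in enumerate(seq) if a == 0]
    return sum(1 for one, zero in zip(ones, zeros) if one < zero)


def score_counts(n):
    """Tally the scores of all arrangements of n 0's and n 1's."""
    counts = Counter()
    for one_positions in combinations(range(2 * n), n):
        seq = [0] * (2 * n)
        for i in one_positions:
            seq[i] = 1
        counts[score(seq)] += 1
    return counts


print(score([0, 1, 1, 1, 0, 0, 0, 1]))   # the worked example 01110001
print(sorted(score_counts(4).items()))   # one count per score 0, 1, ..., 4
```

If the theorem holds, every score from 0 to n receives the same count, namely the total number of arrangements divided by n + 1.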

We shall need to consider the first point in the sequence at which there have been equal numbers of
0's and 1's (this corresponds in coin-tossing to the first toss at which the game is even). For each
arrangement $(a_1, \dots, a_{2n})$ let k be the smallest positive integer such that $a_1 + \dots + a_{2k} = k$. We say that
the sequence 'breaks' at 2k, and note that
$$[a_1 \dots a_{2n}] = [a_1 \dots a_{2k}] + [a_{2k+1} \dots a_{2n}]. \tag{1}$$


* Galton's analysis has been extensively reviewed by Fisher (1945, Ch. 11), who points out that
Darwin did not actually have two samples of n, but n matched pairs. It may be noted that the pairing
did not serve to reduce the variance; in fact, if we test the hypothesis that there is no pair effect we have
$F_{14,14} = 0.554$. In any case, our present concern is with the problem of two samples of n.
Note that k may equal n. As 2k is the first equilibrium point, there cannot have been an earlier change of
lead, so $[a_1 \dots a_{2k}]$ must be either 0 or k. Further, if $[a_1 \dots a_{2k}] = k$, we must have $a_1 = 1$, $a_{2k} = 0$ and
$[a_2 \dots a_{2k-1}] = k - 1$.

We now define the functions $T_n$ inductively. Let $T_1(10) = (01)$ and check properties (a)-(d) for this
case. Suppose we have defined $T_m$ for m < n satisfactorily. For any $(a_1 \dots a_{2n})$ of positive score, let
$T_n(a_1 \dots a_{2n})$ be

(i) $(a_1 \dots a_{2k}\, T_{n-k}(a_{2k+1} \dots a_{2n}))$ if $[a_{2k+1} \dots a_{2n}] > 0$,
(ii) $(0\, a_{2k+1} \dots a_{2n}\, 1\, a_2 \dots a_{2k-1})$ if $[a_{2k+1} \dots a_{2n}] = 0$.


We must check (a)-(d), of which (a) is obvious. Condition (d) holds for (i) by the induction hypothesis
and (1). As for (ii), if $[a_{2k+1} \dots a_{2n}] = 0$, we must have
$$0 < [a_1 \dots a_{2k}] = k = [a_1 \dots a_{2n}], \quad [0\, a_{2k+1} \dots a_{2n}\, 1] = 0, \quad [a_2 \dots a_{2k-1}] = k - 1 = [T_n(a_1 \dots a_{2n})].$$
From (d) it appears that the range of $T_n$ is contained in the set of sequences with score less than n; any
such sequence has a $T_n$-inverse under (i) if $[a_{2k+1} \dots a_{2n}] < n - k$, since $T_{n-k}$ satisfies the conditions,
and under (ii) otherwise, since when $[a_{2k+1} \dots a_{2n}] = n - k$ we have
$$[a_1 \dots a_{2k}] = 0, \quad a_1 = 0, \quad a_{2k} = 1, \quad [a_2 \dots a_{2k-1}] = 0, \quad (a_1 \dots a_{2n}) = T_n(1\, a_{2k+1} \dots a_{2n}\, 0\, a_2 \dots a_{2k-1}).$$
The ranges of (i) and (ii) are disjoint and each is invertible, so (c) holds.
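
The inductive definition translates directly into a short recursive routine, and properties (b), (c) and (d) can then be checked exhaustively for small n. This is a sketch under the reading of rules (i) and (ii) given above; the function names (`first_break`, `T`, `check_properties`) are illustrative and do not appear in the paper.

```python
from itertools import combinations


def score(seq):
    """Count the 1's that precede the 0 of the same ordinal number."""
    ones = [i for i, a in enumerate(seq) if a == 1]
    zeros = [i for i, a in enumerate(seq) if a == 0]
    return sum(1 for one, zero in zip(ones, zeros) if one < zero)


def first_break(seq):
    """Smallest k >= 1 such that the first 2k terms hold k 0's and k 1's."""
    ones = 0
    for length, a in enumerate(seq, start=1):
        ones += a
        if 2 * ones == length:
            return length // 2
    raise ValueError("sequence does not contain equal numbers of 0's and 1's")


def T(seq):
    """Map a sequence of positive score to one whose score is smaller by 1."""
    seq = tuple(seq)
    if score(seq) == 0:
        raise ValueError("T is defined only for sequences of positive score")
    k = first_break(seq)
    head, tail = seq[:2 * k], seq[2 * k:]
    if score(tail) > 0:
        return head + T(tail)                      # rule (i)
    # Rule (ii): the head has score k, so it starts with 1 and ends with 0.
    return (0,) + tail + (1,) + head[1:-1]


def balanced_sequences(n):
    """All arrangements of n 0's and n 1's, as tuples."""
    for one_positions in combinations(range(2 * n), n):
        seq = [0] * (2 * n)
        for i in one_positions:
            seq[i] = 1
        yield tuple(seq)


def check_properties(n):
    """Exhaustively test properties (b), (c) and (d) for sequences of length 2n."""
    seqs = list(balanced_sequences(n))
    domain = [s for s in seqs if score(s) > 0]
    images = [T(s) for s in domain]
    lowers_score = all(score(t) == score(s) - 1 for s, t in zip(domain, images))
    one_to_one = len(set(images)) == len(images)
    onto = set(images) == {s for s in seqs if score(s) < n}
    return lowers_score and one_to_one and onto


print(T((1, 0)), T((1, 1, 0, 0)), check_properties(4))
```

The exhaustive check grows quickly with n, so it is a sanity test on the definition rather than a substitute for the induction.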


4. While the Galton test as defined above applies only when the two samples are of equal size, there
is a natural extension to the more general case. For example, let $n_1 = 3$, $n_2 = 11$. The third, sixth and
ninth ranking observations in the larger sample divide it into four segments, each containing two
observations. If we regard these three observations as 'representing' the larger sample, we may calculate
G as before between the two samples of three each, and it is clear that G thus defined has the uniform
distribution, since to each arrangement for it there corresponds the same number (27) of arrangements
of the original problem. In general, if $n_2 = n_1 + k(n_1 + 1)$, we may represent the second sample by its
observations of rank k + 1, 2k + 2, ..., $n_1 k + n_1$, and obtain a G which is uniformly distributed over the
values 0, 1, ..., $n_1$.

If $n_2 - n_1$ does not happen to be divisible by $n_1 + 1$, the above method will not apply exactly, but we
may still obtain a uniformly distributed test statistic by randomization. The representative values of 
the larger sample may be chosen so that the segments into which they partition it differ by at most one 
in size. If we select one among all such partitionings at random, the resulting G may be shown to be 
uniformly distributed. Because of the natural repugnance of randomized decisions in such problems, 
it is probably preferable instead to associate with each arrangement the distribution of G values it has 
among the partitions, and choose for its definitive G the integer nearest the expectation of this 
distribution. 

An interesting result follows if we let $n_2 \to \infty$. The population from which the second sample was
chosen is then known, and we are dealing with the one sample rather than the two sample problem. Our
statistic is now definable as the number of the quantities $y_k - F^{-1}\{(n_1 + 1 - k)/(n_1 + 1)\}$ which are positive.
That its distribution is still uniform may be seen by a limiting argument from the above, or by con- 
sidering it as a set of conditioned partial sums. 
On bounds for the normal integral*


By JOHN T. CHU 
Institute of Statistics, University of North Carolina 


1. Let
$$v = \int_0^x (2\pi)^{-1/2} e^{-t^2/2}\,dt \quad (x > 0). \tag{1}$$

G. Pólya (1949) and J. D. Williams (1946) proved independently that
$$v \le \tfrac12 (1 - e^{-2x^2/\pi})^{1/2}. \tag{2}$$

Two simple questions follow naturally: (i) Is it possible to replace the constant 2/π in (2) by a smaller
quantity without breaking the inequality? (ii) Does there exist a lower bound, in a similar form, for v?
We find the following answer.

If for all x > 0, the integral v given by (1) satisfies
$$\tfrac12 (1 - e^{-ax^2})^{1/2} \le v \le \tfrac12 (1 - e^{-bx^2})^{1/2}, \tag{3}$$
then it is necessary and sufficient that $0 < a \le \tfrac12$ and $b \ge 2/\pi$.

The proof of this statement is simple. First,
$$\lim_{x \to 0} v^2/(1 - e^{-bx^2}) = (2\pi b)^{-1}.$$
Hence if (3) is true, $b \ge 2/\pi$. On the other hand (3) implies $x^2/[-\log(1 - 4v^2)] \le 1/a$ for all real x. Since
the limit of this ratio, as x → ∞, is 2, we have $a \le \tfrac12$. Finally
$$4v^2 = \int_{-x}^{x}\int_{-x}^{x} (2\pi)^{-1} e^{-(s^2+t^2)/2}\,ds\,dt \ge \int_0^{2\pi}\int_0^{x} (2\pi)^{-1} e^{-r^2/2}\, r\,dr\,d\theta.$$
Therefore
$$v \ge \tfrac12 (1 - e^{-x^2/2})^{1/2}. \tag{4}$$


Pólya showed that as x varies from 0 to ∞, the ratio of the L.H.S. (left-hand side) of (2) to the R.H.S.
decreases steadily from 1 to a minimum value and then increases steadily. Williams's calculations
indicate that, approximately, the minimum value 0.9930 is taken at x = 1.6. Using a similar method to
that of Pólya, it can be shown that the ratio of the L.H.S. of (4) to the R.H.S. is a steadily decreasing
function of x for all x > 0; for the derivative of this ratio has the same sign as
$$2(e^{x^2/2} - 1) - x\,e^{x^2/2}\int_0^x e^{-t^2/2}\,dt,$$
which is non-positive since
$$e^{x^2/2}\int_0^x e^{-t^2/2}\,dt = \sum_{n=1}^{\infty} \frac{x^{2n-1}}{1 \cdot 3 \cdots (2n-1)}. \tag{5}$$

As a consequence, we obtain that this ratio (of the L.H.S. of (4) to the R.H.S.) has an upper bound $2/\sqrt{\pi}$.
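
The statements of this section are easy to check numerically. The sketch below evaluates v through the error function and compares it with the bounds (2) and (4) on a grid; the grid, the tolerances and the function names are choices made for illustration only.

```python
import math


def v(x):
    """Normal integral from 0 to x, i.e. Phi(x) - 1/2."""
    return 0.5 * math.erf(x / math.sqrt(2.0))


def upper_2(x):
    """Right-hand side of (2); expm1 keeps accuracy for small x."""
    return 0.5 * math.sqrt(-math.expm1(-2.0 * x * x / math.pi))


def lower_4(x):
    """Right-hand side of (4)."""
    return 0.5 * math.sqrt(-math.expm1(-0.5 * x * x))


GRID = [0.01 * k for k in range(1, 501)]          # x = 0.01, 0.02, ..., 5.00
ratio_2 = [v(x) / upper_2(x) for x in GRID]       # ratio of v to the bound (2)
ratio_4 = [v(x) / lower_4(x) for x in GRID]       # ratio of v to the bound (4)

min_ratio = min(ratio_2)
x_at_min = GRID[ratio_2.index(min_ratio)]
print(f"minimum of v / bound (2): {min_ratio:.4f} near x = {x_at_min:.2f}")
print(f"v / bound (4) near x = 0: {ratio_4[0]:.5f}, 2/sqrt(pi) = {2 / math.sqrt(math.pi):.5f}")
```

The first printed line can be compared with Williams's minimum and the second with the limiting value of the ratio as x tends to 0.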


2. A different lower bound for v can be obtained easily from a result proved by Chu & Hotelling (1954).
There we showed that for all x > 0,
$$x^2 (1 - 4v^2)/(4v^2) \le \tfrac12\pi. \tag{6}$$
Hence it follows that
$$v \ge \tfrac12 [2x^2/(\pi + 2x^2)]^{1/2}. \tag{7}$$

For easy reference, we will give here a proof of (6). Let
$$g_0(x) = x^2 (1 - 4v^2)/(4v^2), \tag{8}$$
then $\lim_{x \to 0} g_0(x) = \tfrac12\pi$. We will show that $g_0(x)$ is decreasing. Let a prime denote differentiation with
respect to x. Then
$$g_0'(x) = [x/(2v^3)]\, g_1(x), \tag{9}$$
where
$$g_1(x) = v(1 - 4v^2) - x v', \tag{10}$$

* Work sponsored by the Office of Naval Research under Contract NR 042031 at Chapel Hill. 
$$g_1'(x) = g_2(x)\, v',$$
where
$$g_2(x) = x^2 - 12v^2,$$
$$g_2'(x) = (12/\pi)\, e^{-x^2} g_3(x), \tag{11}$$
where
$$g_3(x) = \tfrac16 \pi x\, e^{x^2} - e^{x^2/2}\int_0^x e^{-t^2/2}\,dt.$$
From (5), we have
$$g_3(x) = \tfrac16 \pi x\, e^{x^2} - \sum_{n=1}^{\infty} \frac{x^{2n-1}}{1 \cdot 3 \cdots (2n-1)}.$$
It can be shown, by an argument similar to that used by Pólya for a similar purpose, that
$$g_3(x) = x^3 (a_0 x^{-2} + a_1 + a_2 x^2 + \dots), \tag{12}$$
where $a_0 < 0$ and $a_j > 0$, j = 1, 2, .... Hence there exists an $x_0 > 0$ such that $g_3(x) < 0$ if $0 < x < x_0$ and
$g_3(x) > 0$ if $x > x_0$. So as x increases from 0 to ∞, $g_2(x)$ decreases steadily from 0 to a minimum and then
increases steadily to ∞. Consequently $g_1(x)$ first decreases steadily and then increases steadily. As
$\lim_{x \to 0} g_1(x) = \lim_{x \to \infty} g_1(x) = 0$, it becomes clear that $g_1(x) < 0$ for all x > 0. Therefore $g_0(x)$ is a decreasing
function of x. Hence we have (6).

Comparison can be made easily of the two lower bounds for v given by (4) and (7). For simplicity, they
will be denoted by a(x) and b(x) respectively. Now $a(x) \gtrless b(x)$ according as $c(x) = e^{x^2/2} - 2x^2/\pi - 1 \gtrless 0$.
As x varies from 0 to ∞, c′(x), the derivative of c(x), changes sign from negative to positive. So does c(x).
If $x = x_0$ is the solution of c(x) = 0, then $x_0 = 1$ approximately (the exact value is slightly smaller).
Therefore, the lower bound in (7) is closer to v than that in (4) if 0 < x < 1 (approximately) and less close
if x > 1.
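
The crossing point of the two lower bounds can be located numerically. The sketch below writes the bound (4) as one half of the square root of 1 − exp(−x²/2), the bound (7) as one half of the square root of 2x²/(π + 2x²), and takes c(x) = exp(x²/2) − 2x²/π − 1; the bisection helper and the bracketing interval are illustrative choices.

```python
import math


def v(x):
    """Normal integral from 0 to x."""
    return 0.5 * math.erf(x / math.sqrt(2.0))


def lower_4(x):
    """Lower bound (4), denoted a(x) in the text."""
    return 0.5 * math.sqrt(-math.expm1(-0.5 * x * x))


def lower_7(x):
    """Lower bound (7), denoted b(x) in the text."""
    return 0.5 * math.sqrt(2.0 * x * x / (math.pi + 2.0 * x * x))


def c(x):
    """a(x) exceeds b(x) exactly when c(x) is positive."""
    return math.exp(0.5 * x * x) - 2.0 * x * x / math.pi - 1.0


def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (f_lo > 0):
            lo, f_lo = mid, f(mid)
        else:
            hi = mid
    return 0.5 * (lo + hi)


x0 = bisect(c, 0.5, 1.5)
print(f"c(x) = 0 at x0 = {x0:.4f}")
for x in (0.5, 2.0):
    print(f"x = {x}: (4) gives {lower_4(x):.5f}, (7) gives {lower_7(x):.5f}, v = {v(x):.5f}")
```

Below the root the bound (7) is the larger of the two, and above it the bound (4) is.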

Further, the following statement is of similar nature to the one made in § 1.

If for all x > 0,
$$v \ge \tfrac12 [ax^2/(1 + ax^2)]^{1/2}, \tag{13}$$
then it is necessary and sufficient that $0 \le a \le 2/\pi$. On the other hand, for no finite a can the R.H.S. of (13)
be, for all x > 0, an upper bound for v.

The above statement can be shown easily by considering the limit, as x → 0, of the ratio of v to the
R.H.S. of (13); and the limit, as x → ∞, of $(1 - 4v^2)(1 + ax^2)$.


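3. Several authors have derived inequalities for Mills's ratio. Their results can be written in the form
of bounds for the normal integral. For example, in our notation, Gordon's (1941) inequalities are
equivalent to
$$\tfrac12 - \frac{1}{x}(2\pi)^{-1/2} e^{-x^2/2} \le v \le \tfrac12 - \frac{x}{x^2+1}(2\pi)^{-1/2} e^{-x^2/2} \quad \text{for } x > 0. \tag{14}$$

Birnbaum (1942) improved Gordon's upper bound in (14) and obtained
$$v \le \tfrac12 - \frac{(4 + x^2)^{1/2} - x}{2}(2\pi)^{-1/2} e^{-x^2/2} \quad \text{for } x > 0. \tag{15}$$

More recently, Tate (1953) showed what amounts to
$$\left(\tfrac14 + \frac{e^{-x^2}}{2\pi x^2}\right)^{1/2} - \frac{e^{-x^2/2}}{x(2\pi)^{1/2}} \le v \le \tfrac12 (1 - e^{-x^2})^{1/2} \quad \text{for } x > 0. \tag{16}$$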
We will now compare briefly (2) and (4) with (14), (15) and (16). The upper bound in (16) is obviously
not so good as that in (2). The lower bound in (16) is non-negative for all real x. It is $\gtrless$ the R.H.S. of (4)
according as $h(x) = x^2 - (8/\pi)(1 - e^{-x^2/2}) \gtrless 0$, as will be seen by squaring the difference twice. As x varies
from 0 to ∞, h(x) decreases steadily from 0 to a minimum and then increases steadily to ∞; and vanishes
at x = 1.01 approximately (the exact value is slightly smaller). Therefore the lower bound in (16) is
closer to v than that in (4) if and only if x > 1.01 approximately.

The lower bound in (14) is an increasing function of x for all x > 0. It is non-negative when x > 0.65
(approximately); and in this case it is $\gtrless$ the R.H.S. of (4) according as $g(x) = 2(2/\pi)^{1/2} x - x^2 - (2/\pi) e^{-x^2/2} \lessgtr 0$.
As x varies from 0 to ∞, g(x) increases steadily from −2/π to a maximum and then decreases steadily
to −∞. The two roots of g(x) = 0 are approximately x = 0.5 and x = 1.45. Hence the lower bound in
(14) is closer to v than that of (4) if and only if x > 1.45 approximately.
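
These comparisons can be reproduced numerically. The sketch below assumes the usual Mills's-ratio forms of the Gordon, Birnbaum and Tate bounds, rewritten for v, and takes h(x) = x² − (8/π)(1 − exp(−x²/2)) and g(x) = 2(2/π)^½ x − x² − (2/π) exp(−x²/2); all function names and the bracketing intervals for the roots are illustrative.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def v(x):
    """Normal integral from 0 to x."""
    return 0.5 * math.erf(x / math.sqrt(2.0))


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / SQRT_2PI


def gordon_lower(x):
    return 0.5 - phi(x) / x


def gordon_upper(x):
    return 0.5 - x * phi(x) / (x * x + 1.0)


def birnbaum_upper(x):
    return 0.5 - 0.5 * (math.sqrt(4.0 + x * x) - x) * phi(x)


def tate_lower(x):
    b = phi(x) / x
    return math.sqrt(0.25 + b * b) - b


def tate_upper(x):
    return 0.5 * math.sqrt(-math.expm1(-x * x))


def h(x):
    return x * x - (8.0 / math.pi) * (-math.expm1(-0.5 * x * x))


def g(x):
    return 2.0 * math.sqrt(2.0 / math.pi) * x - x * x - (2.0 / math.pi) * math.exp(-0.5 * x * x)


def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (f_lo > 0):
            lo, f_lo = mid, f(mid)
        else:
            hi = mid
    return 0.5 * (lo + hi)


root_h = bisect(h, 0.5, 2.0)
roots_g = (bisect(g, 0.3, 0.8), bisect(g, 1.2, 1.8))
print(f"h vanishes near x = {root_h:.3f}; g vanishes near x = {roots_g[0]:.3f} and {roots_g[1]:.3f}")
```

The printed roots may be compared with the approximate values 1.01, 0.5 and 1.45 quoted in the text.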








Finally we point out that, for values of x close to 0, the upper bound in (2) is better than those in (14)
and (15); while for large values of x, the latter two are better. No detailed comparison is attempted. 


The author wishes to thank the referee for calling his attention to R. F. Tate’s work and suggesting 
adding to the original note some comparison of the new and known results. Thanks are also due to 
Professor Harold Hotelling for his critical reading of the manuscript. 
Substitutes for χ²


By J. B. S. HALDANE
Department of Biometry, University College, London
Neyman (1930) and Jeffreys (1948, p. 170) have suggested a substitute for $\chi^2$ involving some saving of
computation. I here suggest what I believe to be a better one. If a sample consists of N individuals
belonging to m classes, and $n_r$ belong to the rth class, the expected number on some hypothesis being
$Na_r$, where $\sum_{r=1}^{m} a_r = 1$, then
$$\chi^2 = \sum_{r=1}^{m} \frac{(n_r - Na_r)^2}{Na_r}.$$
Neyman's
$$\chi'^2 = \sum_{r=1}^{m} \frac{(n_r - Na_r)^2}{n_r}.$$
I consider
$$\chi''^2 = \sum_{r=1}^{m} \frac{(n_r - Na_r)^2}{n_r + 2}. \tag{1}$$

Since there is a finite probability that any $n_r$ should be zero, it is clear that the expectation of $\chi'^2$
is formally infinite. I shall show that it still exceeds m − 1 even when samples in which any $n_r = 0$ are
excluded. Haldane (1953) gave reasons for preferring $n_r + 1$ as a divisor in a similar context. It can be
shown that
$$\mathcal{E}\left[\sum_r \frac{(n_r - Na_r)^2 + b}{n_r + c}\right] = m - 1 + N^{-1}\left[(b - c + 2)\sum a_r^{-1} - (3 - c)m + 1\right] + O(N^{-2}).$$
Hence to avoid an infinite expectation c must be positive, and to avoid a multiple of $\sum a_r^{-1}$, which may
be large, in the expectation, we must have b = c − 2. The value b = 0 gives a simple formula, though
b = 1 gives an expectation nearer to $\mathcal{E}(\chi^2)$ when N is large.
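Let $n_r = Na_r + x_r$. Then $\chi^2 = N^{-1}\sum_r x_r^2 a_r^{-1}$,
$$\chi'^2 = N^{-1}\sum_r x_r^2 a_r^{-1}\left(1 + \frac{x_r}{Na_r}\right)^{-1} = \chi^2 + \sum_{i=2}^{\infty}\left[N^{-i}\sum_r (-x_r)^{i+1} a_r^{-i}\right],$$
$$\chi''^2 = N^{-1}\sum_r x_r^2 a_r^{-1}\left(1 + \frac{x_r + 2}{Na_r}\right)^{-1} = \chi^2 + \sum_{i=1}^{\infty}\left[N^{-i-1}\sum_r x_r^2 (-x_r - 2)^i a_r^{-i-1}\right].$$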
i=1 r 
To find the expectations of these quantities we require the expectations of powers of $x_r$, namely,
$$\mathcal{E}(x_r) = 0, \quad \mathcal{E}(x_r^2) = Na_r(1 - a_r), \quad \mathcal{E}(x_r^3) = Na_r(1 - a_r)(1 - 2a_r), \quad \mathcal{E}(x_r^4) = 3N^2 a_r^2 (1 - a_r)^2 + O(N).$$

If we write $\mathcal{E}^*(x_r^i)$ to mean the expected value of $x_r^i$ when $n_r$ is not zero, we omit the cases where $x_r = -Na_r$,
which have a probability $(1 - a_r)^N$, which tends to zero quicker than any negative power of N. Thus
$$\mathcal{E}^*(x_r) = \frac{Na_r(1 - a_r)^N}{1 - (1 - a_r)^N}, \quad \mathcal{E}^*(x_r^2) = \frac{Na_r(1 - a_r) - N^2 a_r^2 (1 - a_r)^N}{1 - (1 - a_r)^N}, \quad \text{etc.}$$
So
$$\mathcal{E}(\chi^2) = N^{-1}\sum_{r=1}^{m} a_r^{-1}\mathcal{E}(x_r^2) = m - 1,$$
$$\mathcal{E}(\chi'^2) = \infty,$$
$$\mathcal{E}^*(\chi'^2) = m - 1 + N^{-1}\left(2\sum a_r^{-1} - 3m + 1\right) + O(N^{-2}), \tag{2}$$
$$\mathcal{E}(\chi''^2) = (m - 1)(1 - N^{-1}) + O(N^{-2}).$$

Thus even if we exclude the samples where any $n_r$ is zero, $\chi'^2$ has a positive bias often exceeding twice
the reciprocal of the smallest expectation. The bias of $\chi''^2$ is smaller, and readily calculated. The higher
moments of the distribution of $\chi'^2$ and of $\chi''^2$, provided samples where any $n_r = 0$ are excluded, differ
from those of $\chi^2$ by quantities of the order $N^{-1}$. Errors of this order are neglected in the ordinary use
of $\chi^2$, and can be neglected in that of $\chi''^2$, since $\chi^2$ would be used if great precision were required.

As a numerical example, suppose that the numbers expected in four classes are 63, 21, 21 and 7, those
observed being 71, 13, 16 and 12. Then $\chi^2 = 8.825$, $\chi'^2 = 9.470$, $\chi''^2 = 8.318$. If we reverse the signs of the
deviations, so that the observed numbers are 55, 29, 26 and 2, we find $\chi^2 = 8.825$, $\chi'^2 = 16.832$,
$\chi''^2 = 10.330$. The addition of the bias 0.0268 to $\chi''^2$ gives values of 8.345 and 10.357, and this correction
is clearly negligible. It is clear that $\chi''^2$ is a far better approximation than $\chi'^2$, and as it is no harder to
calculate, it should be preferred.
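
The numerical example is easily recomputed. The sketch below takes the three statistics to be the sums of squared deviations divided by the expected number, by the observed number, and by the observed number plus 2, respectively; the function name and the returned tuple are illustrative choices, and the figures should agree with those quoted to within rounding in the last place.

```python
def chi_square_variants(observed, expected):
    """Return (chi2, chi2 with observed divisors, chi2 with divisors n_r + 2)."""
    pairs = list(zip(observed, expected))
    chi2 = sum((n - e) ** 2 / e for n, e in pairs)
    # Neyman's form divides by the observed number and fails if any class is empty.
    chi2_observed = sum((n - e) ** 2 / n for n, e in pairs)
    chi2_plus_two = sum((n - e) ** 2 / (n + 2) for n, e in pairs)
    return chi2, chi2_observed, chi2_plus_two


expected = [63, 21, 21, 7]
for observed in ([71, 13, 16, 12], [55, 29, 26, 2]):
    values = chi_square_variants(observed, expected)
    print(observed, [round(x, 3) for x in values])

# Size of the leading bias term of the third statistic: (m - 1)/N with m = 4, N = 112.
print(round((len(expected) - 1) / sum(expected), 4))
```

Reversing the signs of the deviations leaves the ordinary statistic unchanged but moves the form with observed divisors considerably, which is the point of the example.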
A problem in the significance of small numbers


By J. B. S. HALDANE
Department of Biometry, University College, London


Dr H. Grüneberg and Dr A. G. Searle have each presented me with the following problem. N mice
were examined anatomically. Before examination they had been classified into m groups, containing
$n_1, n_2, n_3, \dots, n_r, \dots, n_m$ members. The members of a group were in fact grouped together on the basis
of common ancestry. They could have been grouped on the basis of phenotypic resemblance. In one and
only one of these groups, containing n members, a mice were found with a specific anatomical
abnormality. What is the probability that such a coincidence should occur as the result of random sampling?

This question, like all such questions, is rather vaguely stated; and the answer to it may depend on 
the theory of probability adopted by the answerer. The answer here given is therefore not the only 
possible one. 

We can arrange a fourfold table as follows:

                    Normal     Abnormal
  One group         n − a         a
  Other groups      N − n         0

The application of Fisher's (1935) 'exact' method gives
$$P_1 = \frac{(N - a)!\, n!}{N!\, (n - a)!}$$
as the probability that all the abnormals should belong to this particular group. If they had all belonged
to any one group this would have attracted notice provided $P_1$ was fairly small. If, however, one group
had consisted of about half the total, that is to say $n = \tfrac12 N$ approximately, and a had not exceeded 3,
the question of significance would not have arisen. The question can therefore be put somewhat more
concretely: 'What is the probability that if there are a or more abnormals, all should occur in a group
with n or less members?' Or still more definitely: 'If there were just a abnormals, and each group
consisted of n members, what is the probability that by an accident of sampling, all the abnormals should
be found in one group?'
The probability that all the abnormals should be found in one group is
$$P_2 = \frac{(N - a)!}{N!}\sum_{r=1}^{m} \frac{n_r!}{(n_r - a)!}.$$
If each group consists of just n members, which implies that N/n is an integer, this probability is
$$P_3 = \frac{N}{n} \cdot \frac{(N - a)!\, n!}{N!\, (n - a)!} = \frac{(N - a)!\, (n - 1)!}{(N - 1)!\, (n - a)!}.$$
I suggest that this is a reasonable estimate of the probability of the observed event, or of one equally
or more unlikely, even if the values of $n_r$ are unequal. We see that if a = 1, $P_3 = 1$, which is reasonable,
since it is certain that the one abnormal will fall into one of the groups. We can then frame the question
as follows: 'The first abnormal individual was found in a certain group. What is the probability that the
a − 1 abnormals found among the other N − 1 mice should also be members of this group?' Clearly the
value found, i.e. $P_3$, is a reasonable answer to this question.

In the case propounded to me by Dr Searle, N = 472, n = 23, a = 2, m = 4. So
$$P_3 = \frac{470!\, 22!}{471!\, 21!} = \frac{22}{471} = 0.047.$$
The uncorrected value of P is $P_1 = 0.0023$. At least one of the other groups must have consisted of
150 or more mice. Had the two abnormals belonged to this group no question of significance would
have arisen. The estimation of P as about 0.05 rather than about 0.002 allows for the fact that, if the
coincidence in question is an effect of random sampling, a number of other comparable coincidences
would not be considered significant.
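
The probabilities in Dr Searle's case can be recomputed exactly with rational arithmetic. In the sketch below the factorial ratios are written as ratios of binomial coefficients, which is algebraically the same thing; the function names are illustrative and not from the note.

```python
from fractions import Fraction
from math import comb


def p_specified_group(N, n, a):
    """Probability that all a abnormals fall in one specified group of n among N mice."""
    return Fraction(comb(n, a), comb(N, a))


def p_any_group(N, group_sizes, a):
    """Probability that all a abnormals fall together in some one of the groups."""
    return sum(Fraction(comb(n_r, a), comb(N, a)) for n_r in group_sizes)


def p_equal_groups(N, n, a):
    """The same probability when every group is treated as having n members."""
    return Fraction(N, n) * p_specified_group(N, n, a)


N, n, a = 472, 23, 2
print(p_equal_groups(N, n, a), float(p_equal_groups(N, n, a)))
print(float(p_specified_group(N, n, a)))
```

Working in fractions avoids evaluating large factorials and shows the cancellation to 22/471 directly.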
Any test of significance is somewhat arbitrary. For example, in place of the $\chi^2$ test where
$$\chi^2 = \sum_{r} \frac{[a_r - \mathcal{E}(a_r)]^2}{\mathcal{E}(a_r)},$$
where $a_r$ is an observed number and $\mathcal{E}(a_r)$ its expectation, we could use
$$\sum_{r} \frac{|a_r - \mathcal{E}(a_r)|}{[\mathcal{E}(a_r)]^{1/2}}$$
or many other similar statistics. $\chi^2$ is used because it has a fairly simple distribution. On the same data
the alternative might give a higher or a lower value of P. The test here suggested is easy to apply, and
answers a reasonable question.

I think, however, that the problem merits a fuller discussion, and that a solution based on a different 
approach might be of equal or greater value. 
Bounds for the ratio of range to standard deviation


By GEORGE W. THOMSON 
Ethyl Corporation, Detroit 20, Michigan 


1. This note supplements the work of David, Hartley & Pearson (1954) on the distribution of the ratio
of the range w to the standard deviation estimate s, where both are from the same sample of size n.
Bounds are shown to exist for w/s for all populations with non-zero variance and percentage points are
given for samples of three from a normal population.


2. The bounded nature of the distribution of w/s has not been noted by any of the authors who have
investigated this statistic*. It can be readily shown that the upper and lower bounds for w/s for samples
from any population with non-zero variance arise from certain simple configurations of sample points.
The upper bound, which corresponds to minimum s for a given range w, results from the arrangement
with n − 2 of the points at the sample mean and the other two points at equal distances from the mean.
The lower bound, which corresponds to maximum s for a given w, results from the concentration of half
of the sample points at one extreme and the other half (plus one, if the sample size is odd) of the sample
points at the other extreme. The numerical values of the bounds can be shown to be:

Upper bound of w/s: $\sqrt{2(n-1)}$.
Lower bound of w/s: $2\sqrt{(n-1)/n}$ for n even,
                    $2\sqrt{n/(n+1)}$ for n odd.

As n becomes larger the lower bound approaches 2. It is also evident that these bounds are
distribution-free provided that the points can be distributed at all. Table 1 shows the numerical values which
correspond to the lower and upper 0 % points in Table 6 of the paper by David et al.


Table 1. Bounds of the distribution of the ratio of range to standard deviation,
w/s, in samples of size n from a normal population

        Percentage points           Percentage points           Percentage points
   n    Lower 0%   Upper 0%     n   Lower 0%   Upper 0%      n   Lower 0%   Upper 0%

   3     1.732      2.000      10    1.897      4.243        30   1.966      7.616
   4     1.732      2.449      11    1.915      4.472        40   1.975      8.832
   5     1.826      2.828      12    1.915      4.690        50   1.980      9.899
   6     1.826      3.162      13    1.927      4.899        60   1.983     10.863
   7     1.871      3.464      14    1.927      5.099        80   1.987     12.570
   8     1.871      3.742      15    1.936      5.292       100   1.990     14.071
   9     1.897      4.000      16    1.936      5.477       150   1.993     17.263
                               17    1.944      5.657       200   1.995     19.950
                               18    1.944      5.831       500   1.998     31.591
                               19    1.949      6.000      1000   1.999     44.699
                               20    1.949      6.164
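
The bounds can be computed directly and checked against the extreme configurations described above. In this sketch the upper bound is the square root of 2(n − 1) and the lower bound is twice the square root of (n − 1)/n for even n, or of n/(n + 1) for odd n; the function name `ws_bounds` and the two sample configurations are illustrative.

```python
import math
import statistics


def ws_bounds(n):
    """Lower and upper bounds of w/s for a sample of size n >= 3."""
    upper = math.sqrt(2.0 * (n - 1))
    if n % 2 == 0:
        lower = 2.0 * math.sqrt((n - 1) / n)
    else:
        lower = 2.0 * math.sqrt(n / (n + 1))
    return lower, upper


def ratio(sample):
    """Range divided by the standard deviation estimate with divisor n - 1."""
    return (max(sample) - min(sample)) / statistics.stdev(sample)


for n in (3, 10, 20, 1000):
    lower, upper = ws_bounds(n)
    print(n, round(lower, 3), round(upper, 3))

# Extreme configurations for n = 5: three points at the mean, and a 2/3 split.
print(round(ratio([-1, 0, 0, 0, 1]), 3), round(ratio([0, 0, 1, 1, 1]), 3))
```

As the note observes of routine use, a computed s whose ratio w/s falls outside these limits must contain an arithmetical error.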


* [Editorial Note. The existence of these limits has no doubt been noticed by others who have 
considered the distribution of this ratio; in the correspondence leading to the joint paper by David, 
Hartley & Pearson (1954), the first author gave these limits in a letter of February 1954, but they 
were omitted in the published paper. E.S.P.] 
3. Certain properties of the distribution of samples of size 3 from a normal population have been
obtained by Lieblein (1952), as noted by Seth (1950). It is easily shown from Lieblein's results that the
percentage points of w/s for n = 3 are given by

2 cos [30° (1 − F)],
where F is the cumulative frequency. Thus, for the upper 10 % point, F = 0.90, w/s = 2 cos 3° = 1.99726.
Table 2 shows the upper and lower 0.0, 0.5, 1.0, 2.5, 5.0 and 10.0 % points and the median. The upper
percentage points agree to the third decimal place with those given by David et al. (1954, Table 6).


Table 2. Percentage points of the distribution of the ratio of range to standard
deviation, w/s, in samples of size 3 from a normal population

  Lower percentage                 Upper percentage
       points          w/s              points          w/s
         0.0         1.73205             10.0         1.99726
         0.5         1.73466              5.0         1.99931
         1.0         1.73726              2.5         1.99983
         2.5         1.74499              1.0         1.99997
         5.0         1.75763              0.5         1.99999
        10.0         1.78201              0.0         2.00000
       Median
        50.0         1.93185
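
The entries of Table 2 follow from the formula 2 cos [30° (1 − F)], and a few lines suffice to regenerate them. The function name below is an illustrative choice.

```python
import math


def ws_percentage_point_n3(F):
    """Value of w/s with cumulative frequency F, for samples of 3 from a normal population."""
    return 2.0 * math.cos(math.radians(30.0 * (1.0 - F)))


for lower_pct in (0.0, 0.5, 1.0, 2.5, 5.0, 10.0, 50.0):
    F = lower_pct / 100.0
    print(f"lower {lower_pct:5.1f} %: {ws_percentage_point_n3(F):.5f}")

for upper_pct in (10.0, 5.0, 2.5, 1.0, 0.5, 0.0):
    F = 1.0 - upper_pct / 100.0
    print(f"upper {upper_pct:5.1f} %: {ws_percentage_point_n3(F):.5f}")
```

An upper percentage point of size α corresponds to F = 1 − α, which is how the second loop is arranged.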


4. The bounds for w/s have been used in our research laboratory for the past twelve years in routine
checks of the computation of s for small samples, since gross errors can be detected at once. 
On the estimation of population parameters from marked members


By J. A. GULLAND 
Fisheries Laboratory, Lowestoft 


In commercial fish populations there exist two distinct types of mortality, that due directly to capture 
by man, and that due to other causes. For simplification it is not unreasonable to assume that these may
be represented by the constant exponential coefficients, denoted by F and M. It is the separation and 
estimation of these which forms a major part of any study of such populations. 

Typically, the total population is of the order of millions, and recaptures from a marking experiment
consist of all the marked fish caught by any fisherman; recaptures may therefore be considered as
a continuous process rather than as occurring at discrete intervals, as in most experimental trapping
(e.g. Hammersley, 1953). If a known number of marked individuals is released, then direct maximum likelihood
estimates for the two mortalities F and M (in the marked population) may be obtained.

For the probability of an individual remaining alive until time t, when released at time t = 0, is
$e^{-(F+M)t}$. Hence the probability of recapture in the interval (t, t + dt) is $Fe^{-(F+M)t}\,dt$. Suppose now N
individuals are released, of which n are recaptured at times $t_1, t_2, \dots, t_n$, and that these are the only recaptures
up to time T (T > all $t_i$), when the experiment was ended. Then the probability of being recaptured is
$$\int_0^T Fe^{-(F+M)t}\,dt = \frac{F}{F+M}\left(1 - e^{-(F+M)T}\right)$$
and the probability of not being recaptured is
$$\frac{M}{F+M} + \frac{F}{F+M}\, e^{-(F+M)T}.$$
The likelihood function may therefore be written
$$e^{L} = \binom{N}{n} \prod_{i=1}^{n}\left(Fe^{-(F+M)t_i}\right)\left(\frac{M + Fe^{-(F+M)T}}{F+M}\right)^{N-n}.$$
Hence
$$L = \log\binom{N}{n} + n\log F - (F+M)\sum t_i + (N-n)\{\log(M + Fe^{-(F+M)T}) - \log(F+M)\},$$
$$\frac{\partial L}{\partial F} = \frac{n}{F} - \sum t_i + (N-n)\left\{\frac{(1 - FT)\,e^{-(F+M)T}}{M + Fe^{-(F+M)T}} - \frac{1}{F+M}\right\}$$
and
$$\frac{\partial L}{\partial M} = -\sum t_i + (N-n)\left\{\frac{1 - FT\,e^{-(F+M)T}}{M + Fe^{-(F+M)T}} - \frac{1}{F+M}\right\}.$$

If the experiment is continued so long that $e^{-(F+M)T}$ may be neglected, then these equations become
very simple, namely,
$$\frac{\partial L}{\partial F} = \frac{n}{F} - \sum t_i - \frac{N-n}{F+M}, \qquad \frac{\partial L}{\partial M} = -\sum t_i + \frac{(N-n)F}{(F+M)M}.$$
Putting these equal to zero gives the solution
$$\hat{F} = \frac{n^2}{N\sum t_i}, \qquad \hat{M} = \frac{(N-n)\,n}{N\sum t_i}.$$






These estimates are biased, in fact having infinite expectations and higher moments. However, $\sum t_i/n$
is an unbiased estimate of $(F+M)^{-1}$, when $e^{-(F+M)T}$ is neglected, and in practice, for reasonably large
values of n and N the distribution of $\hat{F}$ and $\hat{M}$ for repeated experiments is likely to be quite reasonable.
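
The simplified estimates are trivial to compute, and it is easy to confirm that they satisfy the likelihood equations once the term in exp{−(F+M)T} is neglected. In the sketch below the number released and the recapture times are invented purely for illustration, and the function names are not from the note.

```python
def mortality_estimates(N, recapture_times):
    """Estimates of F and M for a long experiment: n^2/(N*S) and (N - n)*n/(N*S), S = sum of times."""
    n = len(recapture_times)
    total = sum(recapture_times)
    return n * n / (N * total), (N - n) * n / (N * total)


def score_equations(F, M, N, recapture_times):
    """dL/dF and dL/dM with the exponential term in T neglected."""
    n = len(recapture_times)
    total = sum(recapture_times)
    dL_dF = n / F - total - (N - n) / (F + M)
    dL_dM = -total + (N - n) * F / ((F + M) * M)
    return dL_dF, dL_dM


# Hypothetical experiment: 100 fish released, 4 recaptured at these times (in years).
N = 100
times = [0.2, 0.5, 0.9, 1.4]
F_hat, M_hat = mortality_estimates(N, times)
print(F_hat, M_hat, score_equations(F_hat, M_hat, N, times))
```

The sum of the two estimates equals n divided by the sum of the recapture times, so the total mortality is estimated by the reciprocal of the mean time to recapture, while the proportion recaptured, n/N, apportions it between F and M.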
REVIEWS


Biomathematics. By Cedric A. B. Smith. London: Charles Griffin and Co. Ltd.
Pp. xv+712. 80s. net.


Biomathematics, by the late Dr W. M. Feldman, appeared in 1923 and reached a second edition in 
1935. The present book is modestly described by Dr Smith as a third edition. But it has been entirely 
rewritten and is, in effect, a new book. 

Although advanced statistical theory uses nearly all branches of mathematics, except perhaps pro- 
jective and differential geometry, it draws on them to very varying extents. The student of statistics 
can go a long way with only a nodding acquaintance with analytic geometry and differential equations, 
but finds a knowledge of calculus and advanced algebra almost essential from the outset. He therefore 
requires a special course of mathematics for statisticians and will not find it within the covers of any 
single book. There is a great and growing need for a work of this kind. 

Dr Smith’s book does for biologists what one would like to see done for statisticians in general. It 
provides a thorough grounding in the ideas and techniques necessary for those who want to be mathe- 
matical biologists without going so far as to be biological mathematicians; and it does so with great 
fluency and insight without any sacrifice of rigour. There can be few writers as well qualified as Dr Smith 
to write such a work. His mathematical powers, his extensive knowledge and practical experience of 
mathematical and statistical applications in biology, the care he has bestowed on the work and his 
gifts as a teacher combine to make this a work of outstanding excellence. It will be useful not only to 
biologists but to any student who requires a competence in mathematics for the purposes of pursuing 
his own subject. 

The book opens with two chapters on arithmetic, including some calculating devices and some 
account of mechanical equipment and punched card machines. Chapter 3 is a refresher course on 
algebra, leading to the treatment of inequalities in Chapter 4. Chapter 5 deals with the connexion 
between algebra and geometry and provides a foundation in analytic geometry and conic sections. 
The next two chapters deal with logarithms (which Dr Smith introduces by means of the equi-angular 
spiral) and their applications in computation by slide-rules and nomograms. Chapters 8-12 are on 
differential and integral calculus and proceed as far as simple differential equations and partial
differentials. Chapter 13 deals with series and Chapter 14 with vectors. Methods of solving equations and
matrices receive separate chapters, and the book closes with three chapters on Chance, Statistical 
Distributions and Simple Statistical Procedures, together with one on Colson’s method of simplifying 
arithmetical calculations. The book contains over 700 pages and the amount of ground covered is 
astonishing considering that no impression is given of hurry or over-condensation.

Readers of Stephen Leacock will remember his protests about the dullness of arithmetical examples, 
and his attempt to dispel it by investing with human attributes those three anonymous characters of 
our youth A, B and C; especially the case of poor C, who died of pneumonia contracted while trying 
to fill a cistern with a leak in it. Dr Smith, who combines with his other qualities a sense of humour, 
would have endeared himself greatly to Leacock. For instance, the additive properties of matrices 
are illustrated on the population of Narkover, the determination of minima is exemplified, not on 
those eternal salmon tins, but on the angles of origin of human arteries; the treatment of physical 
magnitudes is enlivened by an example leading to the conclusion that if one falls into a cold sea the 
best thing to do is to swim hard so as to keep warm. 

Altogether, this is a very sound, readable and sensible book. In spite of its price it should become
widely popular.
M. G. KENDALL


Introduction to Mathematical Statistics. 2nd edition. By Paul G. Hoel. New York:
John Wiley and Sons Inc.; London: Chapman and Hall. 1954. Pp. ix+331. 40s. 


This is a much expanded and revised version of a book that first appeared in 1947. Apart from some 
expansion of many topics such as regression, the analysis of variance and non-parametric methods, 
there have been notable additions in the probability field. As a result the book is much more self- 
contained than formerly. It presupposes a knowledge of calculus—probably about two years study— 
and seems designed for scientists who have some mathematical background and wish to master the 
elements of statistical theory. For the applications to their own fields, they will have to turn to other
texts. 

Inevitably in a book of this size some things have had to be omitted, for example, time series, 
equalization of variance and probit analysis. It is perhaps surprising, in view of the intended public, 
that no mention is made of Sheppard’s corrections for grouping or the many short-cut procedures 
available nowadays, such as the range in the analysis of variance. There is also an unusual differentia- 
tion between large sample and small sample distributions—is the F distribution a small sample 
distribution? Diagrams are used excellently throughout the book and are well labelled. There are 
numerous exercises for which no answers are provided. The book ends with a number of useful tables, 
but whilst the percentage points of t, F and χ² are given, those for the normal curve have to be obtained
by inverse interpolation in the table of the cumulative distribution provided.
P. G. MOORE


Design and Analysis of Industrial Experiments. Edited by O. L. Davies. London and 
Edinburgh: Oliver and Boyd, for Imperial Chemical Industries Limited. 1954. 
Pp. xiii+636. 63s.


This book is written by a team of authors largely the same as those who wrote Statistical Methods in 
Research and Production in 1947, but, as explained in the Introduction, the present volume deals with 
the design of experiments and their subsequent analysis rather than the extraction of information 
from previously existing data. The particular point of the book is that it is written from a broadly 
chemical point of view, rather than the more usual agricultural one, and so the design of experiments 
is freed from the restrictions imposed by agricultural conditions, and sometimes inadvertently carried 
over into fields where they do not apply. 

Successive chapters deal with Simple Comparisons, Sequential Tests—a remarkably fine chapter 
—Sampling and Testing Methods, Randomized Blocks, and Incomplete Randomized Blocks, again 
very good because it concentrates attention on the simpler designs required for industrial experi- 
mentation. It is nice to have the assurance that these are worth while, for the alternative view that 
in them too few observations have to bear a heavy load of theoretical interpretation is somewhat 
prevalent. Then there are four chapters on Factorial Experiments and one on the determination of 
optimum conditions. In this last chapter the treatment is largely orographical, which I like as it 
enables simple words like ‘ridge’ to be used as short-cuts avoiding much difficult mathematical 
explanation. Then follows a glossary which is far the worst part of the book. No readers of a book of 
this type need to have explained what is quaintly called ‘The arithmetic average or mean’, and if
they do the definition here given will hardly help them, nor is it possible to explain ‘Universe, Popula- 
tion, Parameter, Sample, Statistic’ in fourteen lines. The final tables are well arranged. 

The book as a whole contains a very large amount of valuable information, but as might be expected 
from a team the style varies greatly in clearness and readability; certainly the best parts are very good.
There is an implied assumption that nothing but the experiment and its own conditions need be 
considered in planning, and this is certainly not always true for those who work in less magnificent 
concerns than I.C.I., but still you must know how the experiment really ought to be done, before 
planning a compromise with what can be done, and the necessary information is here. 

Finally, it must be emphasized that in industry an investigation is not complete when the results 
have been obtained in conventional statistical form. It is necessary to translate them into the language 
used by the executive who is to decide what action to take on them. It is avowedly not the purpose 
of this book to deal with this question, but there is internal evidence that at least one of the authors 
would be well able to do so, and I should like to suggest that in a future edition such a chapter should 
be added in place of the glossary. 


L. MCMULLEN 

Sample Survey Methods and Theory. By M. H. Hansen, W. N. Hurwitz and W. G.
Madow. New York: John Wiley and Sons Inc.; London: Chapman and Hall. 1953.
Vol. I, Methods and Applications. Pp. xxii+638. 64s. Vol. II, Theory. Pp. xiii+332. 56s.


These books aim to cover the whole field of the sample survey and cater for all tastes and all classes of 
statistician. Vol. I sets out the methods in common use in sampling surveys and the way in which the
methods are applied. It begins with commonsense talk about the fundamental aims of sampling and 
sampling design and continues with the delineation of such statistical ideas as may be expected to be 
useful to the particular purpose of the authors. The chapter on bias and non-sampling errors in survey
results is obviously written from practical experience. Much of the book is concerned with descriptions 
of the various types of sampling—simple random sampling, stratified simple random sampling, simple 
one or two stage cluster sampling, stratified single or multi-stage cluster sampling. In each case the 
procedure is clear, every possible query which might be raised by the would-be user is answered. The 
chapter on estimating variances and the accuracy of the method of estimation will not be helpful 
unless the reader has learned a little elementary statistics but presumably if he has reached this point 
in the book he will willy-nilly have acquired such basic information as he will here find to be necessary. 

Vol. II can either be read in conjunction with Vol. I or separately, when it can be looked on as an
elementary text-book exercise in statistical algebra. The standard here is uneven. It is surely un- 
necessary to explain ‘why we study summation notation’ (p. 11) to someone who is expected to under- 
stand without explanation why we study convergence in probability (p. 72). None the less in spite of 
irrelevancies the algebraic framework is clear and concise and should not cause undue difficulty to 
any student interested enough to read the book. Certain contractions are possibly invented by the 
writers and should be rejected; ‘plim’ standing for ‘limit in the sense of probability’ is one of the least
useful of these. All the formulae used in Vol. I are derived here, and the way is clear for the reader to
make such modifications as his own particular sampling design may dictate. 

The two books taken together cover fully the whole field of sample surveys, and are unlikely to be 
superseded for some years to come. This being the case it is unfortunate that such encyclopaedic works
should be priced out of the possible range for many persons.
F. N. DAVID


Sampling Theory of Surveys with Applications. By Pandurang V. Sukhatme.
New Delhi: The Indian Society of Agricultural Statistics; Ames, Iowa: The Iowa 
State College Press. 1954. Pp. xxviii+491. $6. 


This is the fifth book dealing specifically with sample survey methods and theory to appear in the last 
five years. The author is head of the statistics branch of F.A.O. and has wide experience of sample 
surveys in India and elsewhere, and one might have expected his book to be more closely concerned 
than it is with the methods of overcoming the difficulties arising in agricultural surveys and in surveys 
of underdeveloped territories. In fact, Dr Sukhatme has written a text-book on the algebraical development
of the standard branches of sample-survey theory. On this basis the treatment is painstaking
and clear and the book will be most valuable to the student or research worker who requires an ad hoc
treatment of the theory appropriate to any particular sampling method. 

The up-to-date character of the book is shown by the prominence given to unequal-probability 
sampling, the treatment of which is based in part on the author’s own research, and by the chapter on
non-sampling errors, which contains some novel results on the treatment of interviewing errors. 

However, it is to be regretted that Dr Sukhatme in common with other recent writers on sample 
survey theory has neglected the opportunity to base his treatment of the theory on the simplifying 
principles set out by Yates five years ago in Sampling Methods for Censuses and Surveys. Writing mainly
for practical workers rather than for theoreticians—no proofs of the formulae were given—Yates 
based his exposition on the fact that most survey designs are built up of the following components: 


(a) Method of selection, e.g. simple random, systematic, probability proportional to size.
(b) Stratification.
(c) Use of supplementary information.
(d) Multi-stage sampling.
(e) Multi-phase sampling.


The number of combinations of the various possibilities under each of these heads is so large that 
it would be impossible to give within reasonable bounds the appropriate estimation and variance 
formulae for all sample designs likely to be useful in practice. However, Yates pointed out that the 
contribution from each component could be considered separately and the results combined to give 
the relevant formulae for any particular design. 

As an example, the estimate of a mean based on a stratified sample normally takes the form
$\bar{y} = \sum N_i \bar{y}_i / \sum N_i$, where $\bar{y}_i$ is the estimate of the $i$th stratum mean. Suppose that by referring to the
theory of unstratified sampling under (a) above, we were able to calculate an estimate $s_i^2$ of $V(\bar{y}_i)$.
Yates would immediately write down the estimate of $V(\bar{y})$ as $\sum N_i^2 s_i^2 / (\sum N_i)^2$, and would assert that this
result held generally whatever methods are used for sampling within strata. Sukhatme, however, has
preferred to start from the beginning for each type of stratified sample and to develop explicit formulae 
from first principles for each case, rather than prove a general rule for all types of stratified sample.
Similarly, in his treatment of both the ratio method and multi-stage sampling, the author has pre- 
ferred to treat each case separately from first principles rather than to develop the general rules by 
which the estimates can be built up. This is all the more regrettable since the general laws are easier 
to prove than the formulae for special cases. 
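
The combining rule for stratified samples described above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function name `stratified_mean` and the figures for the two strata are hypothetical, and each stratum is assumed to supply its size N_i, an estimate of its mean, and an estimate s_i² of the variance of that mean, obtained by whatever within-stratum method is in use.

```python
def stratified_mean(strata):
    """Combine per-stratum results into an overall mean and its variance estimate.

    strata: list of (N_i, ybar_i, s2_i), where N_i is the stratum size,
    ybar_i the estimated stratum mean and s2_i an estimate of V(ybar_i).
    """
    total_size = sum(N for N, _, _ in strata)
    # Overall mean: sum(N_i * ybar_i) / sum(N_i)
    mean = sum(N * ybar for N, ybar, _ in strata) / total_size
    # Variance estimate: sum(N_i^2 * s_i^2) / (sum(N_i))^2
    variance = sum(N ** 2 * s2 for N, _, s2 in strata) / total_size ** 2
    return mean, variance


# Hypothetical figures for two strata (not taken from either book).
example_strata = [(100, 10.0, 0.04), (300, 20.0, 0.09)]
mean, variance = stratified_mean(example_strata)
print(mean, variance)
```

The function never asks how the strata were sampled, which is the point of the general rule: only the within-stratum estimates change from one design to another.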

A further opportunity for simplification was missed in the treatment of sampling with unequal 
probabilities with replacement. If we draw a sample of n with replacement from a population of 
values $y_1, y_2, \ldots, y_N$ with selection probabilities $p_1, p_2, \ldots, p_N$, all necessary estimation and variance
formulae can be obtained by regarding the sample as a simple random sample from the discrete
probability distribution in which the observation $y_i$ has associated with it the probability $p_i$. Thus,
once the theory of sampling from an infinite population with equal probabilities of selection has been 
worked out the corresponding results for unequal-probability sampling with replacement emerge as 
corollaries. In Dr Sukhatme’s treatment, however, these results are worked out afresh. 
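
A minimal sketch of this simplification follows. It assumes the usual textbook form of the estimator for a population total, in which each drawn value is divided by its selection probability; that scaling is not spelled out in the review, and the function name `pps_with_replacement` and the data are hypothetical. Once the scaled values are formed, they are treated exactly as an ordinary random sample.

```python
import random
from statistics import mean, variance


def pps_with_replacement(values, probs, n, rng):
    """Estimate a population total from n draws made with replacement.

    values: population values y_1..y_N; probs: selection probabilities p_1..p_N.
    The scaled draws z = y_i / p_i are handled as a simple random sample,
    so the usual mean and variance-of-the-mean formulae apply directly.
    """
    draws = rng.choices(range(len(values)), weights=probs, k=n)
    z = [values[i] / probs[i] for i in draws]
    total_estimate = mean(z)           # sample mean of the scaled values
    variance_estimate = variance(z) / n  # s_z^2 / n, as for any random sample
    return total_estimate, variance_estimate


# Hypothetical population; probabilities proportional to the values,
# in which case every scaled draw equals the true total.
values = [2.0, 4.0, 6.0, 8.0]
probs = [0.1, 0.2, 0.3, 0.4]
print(pps_with_replacement(values, probs, 5, random.Random(0)))
```

No formula specific to unequal probabilities appears in the function body; the only design-dependent step is the division by the selection probability.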

For these reasons the book contains a great deal of avoidable algebra. Although some students 
undoubtedly find it helpful to have an ad hoc theory set out for each particular type of sample design, 
the consequent details are likely to prove unnecessarily time-consuming for the reader whose object
is to grasp the principles of sample-survey theory.
>. Dome


Life and other Contingencies, Vol. I. By P. F. Hooker and L. H. Longley-Cook.
Cambridge University Press. 1953. Pp. viii+312. 22s. 6d. 


This book is one of a series commissioned by the Institute of Actuaries and the Faculty of Actuaries 
to provide a course of reading suitable for the examinations conducted by these bodies. The present 
work replaces Spurgeon’s Life Contingencies, which now ends an honourable career of some forty 
years. The authors have evidently been influenced to a considerable extent by their distinguished 
predecessor. The main changes arise from the demands of the new actuarial syllabus and consist of 
the introduction of sections on extra risks, valuation methods and non-mortality (sickness, maternity, 
etc.) benefits. There are also a number of additions necessitated by developments in theory and prac- 
tice, such as the treatment of family income benefits and the appendix on International Actuarial 
Notation. 

However, the book is by no means merely a rather thorough revision of the earlier text-book. It 
is a completely new work and develops the subject in its own way. Emphasis is definitely on the provision
of methods for practical application in life-office work. There is little preoccupation with complicated
algebraic manipulation, and the theoretical study of mortality laws and stationary populations
is reduced to a minimum. Worked examples are collected at the ends of chapters and, perhaps for this 
reason, appear to be less profuse than in Spurgeon’s text-book. There are no exercises to be worked 
out by the student. 

The treatment of multiple decrement tables is left to Vol. II, and it will be interesting to see how
the authors tackle this still somewhat controversial topic. As a result of this division of subject-matter 
joint-life assurances and annuities are not considered in the present volume. 

It is to be hoped that ‘Hooker and Longley-Cook’ will be as long-lived as ‘Spurgeon’. If this is so 
the reviewer hopes to see in the successive revisions a much fuller treatment of non-mortality benefits, 
and, if possible, either an earlier and more adequate discussion of the construction of tables, or the
omission of this subject by its transfer to another part of the actuarial syllabus.
N. L. JOHNSON


Population Statistics and their Compilation. By Hugh H. Wolfenden. The University
of Chicago Press, for the Society of Actuaries. 1954. Pp. xxii+258. 56s. 6d.


This book is a completely revised edition of a work originally published as ‘Actuarial Study No. 3’ 
by the Actuarial Society of America in 1925. The choice of subject-matter clearly reflects the actuarial 
interests of the author. In the analytical section of the book there is much greater emphasis on the 
study of mortality than on any other aspect of population phenomena. No fewer than 70 pages are 
devoted to the discussion of various methods of constructing mortality-tables and life-tables, while 
a further 23 pages are concerned with the comparison of mortality by occupations, causes, etc. and 
a brief discussion on forecasting mortality rates. There are only eight pages on the use of data on 
marriages, births, orphanhood and unemployment and three pages on ‘Sickness Data’. Emigration and
immigration are not specifically considered at all. The theory of reproductivity is relatively adequately 
treated in 16 pages containing some sensible warnings on the uncritical use of the various indices of
reproductivity. 

The earlier part of the book shows much less evidence of this lack of balance. There is an interesting 
and useful description of the compilation of population statistics by means of censuses, and registra- 
tion of births, marriages and deaths. This is accompanied by a discussion of likely sources of un- 
reliability in the data, and methods of correcting the consequent defects. 

Comprehensiveness has evidently been the author’s ideal. While he has succeeded in making his 
book a mine of information on particular topics, it is to be expected that students may experience 
difficulty in sorting out really important material from items now mainly of historical interest. The 
dust-cover claims that ‘this book is the only presentation, by an actuary, of the particular actuarial 
viewpoints and methods necessary to the production of modern population statistics’. Apart from the 
fact that this statement ignores books published by the Institute of Actuaries in recent years, the 
catalogue-like assembly of methods presented makes difficult the disentanglement of the more modern
methods. 

Despite these criticisms, the book should prove useful as a comprehensive, yet handy, work of 
reference. There is no index, though there is a detailed ‘Table of Contents’ by paragraphs (not by pages).
There are nearly one hundred footnotes which bear witness to the author’s devotion to the ideal of 
comprehensiveness. Most of them contain useful information, though they tend to interrupt the 
textual line of argument. 

There is a 22-page appendix on ‘Some Theory in the Sampling of Human Populations’, by 
W. E. Deming. It is well written but will be of little value except to those with sufficient statistical
training and such persons will probably be familiar with the subject already.
N. L. JOHNSON


Table of Binomial Coefficients. Royal Society Mathematical Tables, 3. London: 
Cambridge University Press. 1954. Pp. viii+ 162. 35s. 

These tables, prepared under the editorship of J. C. P. Miller, were computed mainly at the National
Physical Laboratory, Liverpool University, and the Royal Aircraft Establishment at Farnborough.
They give $^nC_r$ for


(i) $r \le \tfrac{1}{2}n \le 100$,
(ii) $2 \le r \le 12$, $200 \le n \le 500$,
(iii) $2 \le r \le 11$, $500 \le n \le 1000$,
(iv) $2 \le r \le 5$, $1000 \le n \le 2000$,
(v) $r = 2, 3$, $2000 \le n \le 5000$.


The computations throughout are performed by use of the recurrence relation
$$^{n+1}C_r = {}^nC_r + {}^nC_{r-1}.$$
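
The recurrence lends itself to a short illustration. The sketch below is not the procedure used by the computers of the table, only a minimal Python rendering of the same addition rule; the function name `binomial_rows` is hypothetical, and `math.comb` from the standard library is used purely as an independent check.

```python
import math


def binomial_rows(n_max):
    """Return rows 0..n_max of binomial coefficients built by addition alone.

    Row n+1 is formed from row n by (n+1)C_r = nC_r + nC_(r-1),
    with the end values of every row equal to 1.
    """
    rows = [[1]]
    for n in range(n_max):
        prev = rows[-1]
        inner = [prev[r] + prev[r - 1] for r in range(1, n + 1)]
        rows.append([1] + inner + [1])
    return rows


rows = binomial_rows(20)
print(rows[10][3])                      # the coefficient for n = 10, r = 3
print(rows[20][7] == math.comb(20, 7))  # agreement with the library value
```

Because only integer additions are involved, every entry is exact, which is one reason a recurrence of this kind suits the preparation of a large table.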


The main application of the table lies in the field of number theory for such investigations as to the 
possible uses of tetrahedral numbers to form other numbers. For statisticians the combinatorial numbers 
mainly conjure up visions of binomial probabilities and here, as the probabilities have to be combined 
with the combinatorial number, the volume is unlikely to supplant such well-worn friends as the in- 
complete beta function, at any rate for the lower values of n. Occasions do, however, arise where bi- 
nomial coefficients alone are required—certain probability problems of a combinatorial nature for 
example—and these tables will form a useful reference work for such cases. The volume is well set out 
with very clear type, and a useful index is provided to aid one in locating the value required. 


P. G. MOORE 


Bessel Functions and Formulae. Compiled by W. G. Bickley. London: Cambridge
University Press, for the Royal Society. 1953. Pp. 11. 3s. 6d. 


This collection of formulae is a straight reprint of the Summary of Notations and the section on ‘Functions
and Formulae’ from the British Association Mathematical Tables, Vol. X, Bessel Functions, II.
It gives a summary of the differential, integral and difference relations obeyed by the Bessel and
auxiliary functions and also some connecting formulae with related functions (e.g. Hankel’s, Whittaker’s,
etc.). It further gives a variety of series expansions for and involving Bessel functions.
Of particular interest to statisticians are the sections giving generating functions and Laplace 
transforms of Bessel functions.
D. E. BARTON


PUBLICATIONS OF U.S. DEPARTMENT OF COMMERCE,
NATIONAL BUREAU OF STANDARDS 


(i) Tables of Chebyshev Polynomials $S_n(x)$ and $C_n(x)$. Applied Mathematics Series, 9.
1952. Pp. xxix+161. $1.75. 


Defining the nth order polynomials as
$$C_n(x) = 2\cos(n \arccos \tfrac{1}{2}x), \qquad S_n(x) = 2(4-x^2)^{-1/2} \sin\{(n+1) \arccos \tfrac{1}{2}x\},$$
the main table gives their values to 12 decimal places for n = 2(1)12, x = 0(0.001)2. Subsidiary tables
give the expansions of these polynomials in powers of x, together with those of the modified forms:
$$T_n(x) = \tfrac{1}{2}C_n(2x), \qquad T_n^*(x) = \tfrac{1}{2}C_n(4x-2),$$
$$U_n(x) = S_n(2x), \qquad U_n^*(x) = S_n(4x-2).$$

Also given are inverse relations for powers of x as linear sums of $T_n(x)$ and $T_n^*(x)$ up to the twelfth power.
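
The trigonometric definitions can be checked numerically against the polynomial forms. The sketch below assumes the standard three-term recurrence P_(n+1)(x) = x P_n(x) − P_(n−1)(x), with starting values C_0 = 2, C_1 = x and S_0 = 1, S_1 = x; this recurrence is a well-known identity and is not quoted in the notice. All function names are hypothetical.

```python
import math


def c_poly(n, x):
    """C_n(x) from the recurrence, starting from C_0 = 2, C_1 = x."""
    a, b = 2.0, x
    for _ in range(n):
        a, b = b, x * b - a
    return a


def s_poly(n, x):
    """S_n(x) from the same recurrence, starting from S_0 = 1, S_1 = x."""
    a, b = 1.0, x
    for _ in range(n):
        a, b = b, x * b - a
    return a


def c_closed(n, x):
    """C_n(x) = 2 cos(n arccos(x/2)), valid for |x| <= 2."""
    return 2.0 * math.cos(n * math.acos(x / 2.0))


def s_closed(n, x):
    """S_n(x) = 2 (4 - x^2)^(-1/2) sin((n+1) arccos(x/2)), valid for |x| < 2."""
    return 2.0 * (4.0 - x * x) ** -0.5 * math.sin((n + 1) * math.acos(x / 2.0))


def t_poly(n, x):
    """Modified form T_n(x) = C_n(2x) / 2."""
    return 0.5 * c_poly(n, 2.0 * x)


print(c_poly(3, 1.0), c_closed(3, 1.0))
print(s_poly(3, 1.0), s_closed(3, 1.0))
```

Agreement of the two forms over the tabulated range n = 2(1)12 gives a quick check that the definitions have been read correctly.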

These polynomials are chiefly of use as ancillaries to the computing of functions possessing power 
series expansions over a whole range of values of the argument, as the convergence of a series of Cheby- 
shev polynomials is more uniform and in general more rapid than that of a power series expansion taken 
about a point of the range. The computation was carried out under the direction of A. N. Lowan and a 
22-page introduction is written by Cornelius Lanczos. This gives, inter alia, the example of the asymp- 
totic expansion of the incomplete normal integral, where the Chebyshev series to six terms gives one-fifth 
the error of approximation of the power series expansion of the same order, for the possibility of deviating 
more than $\sqrt{2}$ standard deviations from the mean. These polynomials are distinct from those commonly
used in statistical practice and also described as Chebyshev’s, defined by 


rare of (40-9) (HH) 


(ii) Tables of coefficients for the numerical calculation of Laplace transforms. 
Applied Mathematics Series, 30. 1953. Pp. 36. 25 cents. 


These tables give coefficients for the approximate evaluation by quadrature of the Laplace transform 


$$F(p) = \int_0^\infty f(t)\, e^{-pt}\, dt$$


of a function f(t) which is given for n equally spaced values of t whose range includes the effective range 
of the argument. The coefficients are given to nine decimal places for n = 2(1)11 and varying ranges and
intervals of p depending on n (e.g. p = 0.1(0.1)1.0 for n = 2, p = 1(1)10 for n = 11). Special tables are
also given for the simpler case where f(t) is a low order polynomial. A short introduction is provided by 
H. E. Salzer. In this he remarks: ‘Convenient estimates of the error of approximation seem difficult to 
obtain.’ However, in the example given of $f(t) = J_0(t)$ it is found that for n = 11, F(p) differs from the
approximating function by less than 0.14% over the range of p considered.
D. E. BARTON


(iii) Tables of Lagrangian Coefficients for sexagesimal interpolation. Applied
Mathematics Series, 35. 1954. Pp. ix+157. $2.00. 


These tables give 3-, 4-, 5- and 6-point Lagrangian interpolation coefficients A, for arguments in sexagesimal
measure, such as angles given in units of degrees, minutes and seconds. The coefficients are given
for 3600 values of the fraction of the tabular interval. Thus if the function is tabled for each degree, the 
coefficients may be used to find a value of the function at any required minute and second. Each coeffi- 
cient is tabled to eight decimal places. There is a brief Introduction by H. E. Salzer. 


(iv) Tables of circular and hyperbolic sines and cosines for radian arguments.
Applied Mathematics Series, 36. 1953. Pp. x+407. $3.00. 


The main Table I gives for x = 0(0.0001)1.9999 values to nine decimal places of the functions sin x,
cos x, sinh x and cosh x. There are three short supplementary tables:


Table II gives values to nine decimals of the same four functions for x = 0.0(0.1)10.0.
Table III is a conversion table for expressing degrees, minutes and seconds in radians and vice versa.
Table IV gives to 15 decimal places values of $n \times \tfrac{1}{2}\pi$ for n = 1(1)100.


There is a brief Introduction by A. N. Lowan. 


(v) Table of secants and cosecants to nine significant figures at hundredths of 
a degree. Applied Mathematics Series, 40. 1954. Pp. vi+46. 35 cents. 


This table gives sec x and cosec x for x = 0.00(0.01)90.00 degrees. It will serve as a companion to the
table of sin x and cos x to fifteen decimal places at hundredths of a degree previously published as No. 5
in the Applied Mathematics Series.
E. S. PEARSON


CORRIGENDA 
Biometrika (1954), 41


(1) M. E. Wise, p. 328. For equation (8.5) read:
Nex = (N—4n+}) (1-h) +4(p—1) 


(2) P. Whittle, p. 437. On the left-hand side of equation (16)
for §—a&.,—b&. read £,—ab, ,—bé,,, 


(3) H. A. David, p. 466. In Table 2, the result for the rectangular population
with n = 4
for 1.019 read 1.010


(4) D. R. Cox, p. 472. In Table 2, for population (6), the value for n = 4:
for 1.961 read 1.939, and for n = 5: for 2.252 read 2.196


The Corrigenda to papers by Rushton and Ruben stated in the List of Contents for 
Vol. 41, Parts 3 and 4, as printed on p. 568, were wrongly placed in front of the first 
page (p. 287) of that issue. 